SMALL BALL PROBABILITY, INVERSE THEOREMS, AND APPLICATIONS
Abstract. Let $\xi$ be a real random variable with mean zero and variance one and $A = \{a_1, \dots, a_n\}$ be a multi-set in $\mathbb{R}^d$. The random sum

\[ S_A := a_1\xi_1 + \dots + a_n\xi_n, \]

where the $\xi_i$ are iid copies of $\xi$, is of fundamental importance in probability and its applications.

We discuss the small ball problem, the aim of which is to estimate the maximum probability that $S_A$ belongs to a ball with given small radius, following the discovery made by Littlewood-Offord and Erdős almost 70 years ago. We will mainly focus on recent developments that characterize the structure of those sets $A$ where the small ball probability is relatively large. Applications of these results include full solutions of, or significant progress on, many open problems in different areas.
1. Littlewood-Offord and Erdős estimates

Let $\xi$ be a real random variable with mean zero and variance one and $A = \{a_1, \dots, a_n\}$ be a multi-set in $\mathbb{R}$ (here $n \to \infty$). The random sum

\[ S_A := a_1\xi_1 + \dots + a_n\xi_n, \]

where the $\xi_i$ are iid copies of $\xi$, plays an essential role in probability. The Central Limit Theorem, arguably the most important theorem in the field, asserts that if the $a_i$'s are the same, then

\[ \frac{S_A}{\sqrt{\sum_{i=1}^n |a_i|^2}} \longrightarrow N(0,1). \]

Furthermore, the Berry-Esseen theorem shows that if $\xi$ has bounded third moment, then the rate of convergence is $O(n^{-1/2})$. This, in particular, implies that for any small open interval $I$

\[ \mathbf{P}(S_A \in I) = O(|I|/n^{1/2}). \]

The assumption that the $a_i$'s are the same is, of course, not essential. Typically, it suffices to assume that none of the $a_i$'s is dominating; see [13] for more discussion.

The probability $\mathbf{P}(S_A \in I)$ (and its high dimensional generalization) will be referred to as small ball probability throughout the paper. In 1943, Littlewood and Offord, in connection with their studies of random polynomials [33], raised the problem of estimating the small ball probability for arbitrary coefficients $a_i$. Notice that when we do not assume anything about the $a_i$'s, even the Central Limit Theorem may fail, so Berry-Esseen type bounds no longer apply. Quite remarkably, Littlewood and Offord managed to show

Theorem 1.1. If $\xi$ is Bernoulli (taking values $\pm 1$ with probability 1/2) and the $a_i$ have absolute value at least 1, then for any open interval $I$ of length 2,

\[ \mathbf{P}(S_A \in I) = O\Big(\frac{\log n}{n^{1/2}}\Big). \]

Shortly after the Littlewood-Offord result, Erdős [T^ gave a beautiful combinatorial proof of the following refinement, which turned out to be sharp.

Theorem 1.2. Under the assumption of Theorem 1.1,

\[ \mathbf{P}(S_A \in I) \le \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n} = O\Big(\frac{1}{n^{1/2}}\Big). \tag{1} \]

Proof (of Theorem 1.2). Erdős' proof made ingenious use of Sperner's lemma, which asserts that if $\mathcal{F}$ is an anti-chain on a set of $n$ elements, then $\mathcal{F}$ has at most $\binom{n}{\lfloor n/2 \rfloor}$ elements (an anti-chain is a family of subsets none of which contains another). Let $x$ be a fixed number. By reversing the sign of $a_i$ if necessary, one can assume that $a_i \ge 1$ for all $i$. Now let $\mathcal{F}$ be the set of all subsets $X$ of $[n] := \{1, 2, \dots, n\}$ such that

\[ \sum_{i \in X} a_i - \sum_{j \notin X} a_j \in (x - 1, x + 1). \]

One can easily verify that $\mathcal{F}$ is an anti-chain. Hence, by Sperner's lemma,

\[ \frac{|\mathcal{F}|}{2^n} \le \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n}, \]

completing the proof. □
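The bound of Theorem 1.2 can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original argument: the function names are illustrative, and the enumeration over all $2^n$ sign patterns is only feasible for small $n$. Exact arithmetic (integers or `Fraction`) is used so that the open-interval condition is tested without rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb


def max_interval_probability(coeffs, length=2):
    """Largest probability that a random +-1 sum lies in an open interval of the given length."""
    sums = sorted(
        sum(sign * a for sign, a in zip(signs, coeffs))
        for signs in product((-1, 1), repeat=len(coeffs))
    )
    best, lo = 0, 0
    for hi, value in enumerate(sums):
        # Points fit in one open interval exactly when max - min < length.
        while value - sums[lo] >= length:
            lo += 1
        best = max(best, hi - lo + 1)
    return Fraction(best, len(sums))


def erdos_bound(n):
    """Right-hand side of (1): the central binomial coefficient divided by 2^n."""
    return Fraction(comb(n, n // 2), 2 ** n)
```

With all coefficients equal to 1 the bound is attained, which is the sense in which (1) is sharp; other coefficient choices with $|a_i| \ge 1$ should never exceed it.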



The problem was also studied in probability by Kolmogorov, Rogozin, and others; we refer the reader to [30, 31]. Erdős' result is popular in the combinatorics community and has become the starting point for a whole theory that we now start to discuss.

Notation. We use asymptotic notation such as $O$, $o$ under the assumption that $n \to \infty$; $O_a(1)$ means that the constant in big $O$ depends on $a$. All logarithms have natural base, unless specified otherwise.



2. High dimensional extensions



Let $\xi$ be a real random variable and $A = \{a_1, \dots, a_n\}$ a multi-set in $\mathbb{R}^d$, where $d$ is fixed. For a given radius $R > 0$, we define

\[ p_{d,R,\xi}(A) := \sup_{x \in \mathbb{R}^d} \mathbf{P}(a_1\xi_1 + \dots + a_n\xi_n \in B(x,R)), \]

where the $\xi_i$ are iid copies of $\xi$, and $B(x,R)$ denotes the open disk of radius $R$ centered at $x$ in $\mathbb{R}^d$. Furthermore, let

\[ p(d,R,\xi,n) := \sup_A p_{d,R,\xi}(A), \]

where $A$ runs over all multi-sets of size $n$ in $\mathbb{R}^d$ consisting of vectors with norm at least 1. Erdős' theorem can be reformulated as

\[ p(1,1,\mathrm{Ber},n) = \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n} = O(n^{-1/2}). \]

In the case $d = 1$, Erdős obtained the optimal bound for any fixed $R$. In what follows we define $s := \lfloor R \rfloor + 1$.

Theorem 2.1. Let $S(n,m)$ denote the sum of the largest $m$ binomial coefficients $\binom{n}{i}$, $0 \le i \le n$. Then

\[ p(1,R,\mathrm{Ber},n) = 2^{-n} S(n,s). \tag{2} \]
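To make the quantity $S(n,m)$ concrete, here is a small sketch that computes it and compares $2^{-n}S(n,s)$ with the small ball probability of the all-ones multi-set. The function names are illustrative, and the check is restricted to non-integer radii as a simplifying assumption of this sketch; it shows only that the configuration $a_1 = \dots = a_n = 1$ attains the value in (2), not that no other configuration exceeds it.

```python
from fractions import Fraction
from itertools import product
from math import comb, floor


def largest_binomial_sum(n, m):
    """S(n, m): sum of the m largest binomial coefficients C(n, i), 0 <= i <= n."""
    return sum(sorted((comb(n, i) for i in range(n + 1)), reverse=True)[:m])


def small_ball_all_ones(n, radius):
    """Max probability that xi_1 + ... + xi_n lies in an open interval of length 2 * radius."""
    sums = sorted(sum(signs) for signs in product((-1, 1), repeat=n))
    best, lo = 0, 0
    for hi, value in enumerate(sums):
        # Keep only points whose spread is strictly less than the interval length.
        while value - sums[lo] >= 2 * radius:
            lo += 1
        best = max(best, hi - lo + 1)
    return Fraction(best, 2 ** n)


radius = Fraction(3, 2)          # illustrative non-integer radius
s = floor(radius) + 1            # s = 2
print(small_ball_all_ones(9, radius), Fraction(largest_binomial_sum(9, s), 2 ** 9))
```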

The case $d \ge 2$ is much more complicated and has been studied by many researchers. In particular, Katona [23] and Kleitman [25] showed that $p(2,1,\mathrm{Ber},n) = 2^{-n}\binom{n}{\lfloor n/2 \rfloor}$. This result was extended by Kleitman [26] to arbitrary dimension $d$,

\[ p(d,1,\mathrm{Ber},n) = \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n}. \tag{3} \]

The estimate for general radius $R$ is much harder. In [27], Kleitman showed that $2^n p(2,R,\mathrm{Ber},n)$ is bounded from above by the sum of the $2\lfloor R/\sqrt{2} \rfloor$ largest binomial coefficients in $n$. For general $d$, Griggs [19] proved that

\[ p(d,R,\mathrm{Ber},n) \le 2^{2^{d-1}-2} \lceil R\sqrt{d} \rceil \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n}. \]

This result was then improved by Sali [48, 49] to

\[ p(d,R,\mathrm{Ber},n) \le 2^d \lceil R\sqrt{d} \rceil \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n}. \]
A major improvement is due to Frankl and Füredi [14], who proved

Theorem 2.2. For any fixed $d$ and $R$,

\[ p(d,R,\mathrm{Ber},n) = (1 + o(1))\, 2^{-n} S(n,s). \tag{4} \]

This result is asymptotically sharp. In view of (2) and (3), it is natural to ask if the exact estimate

\[ p(d,R,\mathrm{Ber},n) = 2^{-n} S(n,s) \tag{5} \]

holds for all fixed dimensions $d$. However, this has turned out to be false. The authors of [261 E] observed that (5) fails if $s \ge 2$ and

\[ R > \sqrt{(s-1)^2 + 1}. \tag{6} \]

Example 2.3. Take $v_1 = \dots = v_{n-1} = e_1$ and $v_n = e_2$, where $e_1, e_2$ are two orthogonal unit vectors. For this system, let $B$ be the ball of radius $R$ centered at $v = (v_1 + \dots + v_n)/2$. Assume that $n$ has the same parity as $s$; then by definition we have

\[ \mathbf{P}(S_V \in B(v,R)) = 2 \sum_{(n-s)/2 \le i \le (n+s)/2} \binom{n-1}{i} \Big/ 2^n > 2^{-n} S(n,s). \]

Frankl and Füredi raised the following problem.

Conjecture 2.4. [14, Conjecture 5.2] Let $R, d$ be fixed. If $s - 1 \le R < \sqrt{(s-1)^2 + 1}$ and $n$ is sufficiently large, then

\[ p(d,R,\mathrm{Ber},n) = 2^{-n} S(n,s). \]

The conjecture has been confirmed for $s = 1$ by Kleitman (see (3)) and for $s = 2, 3$ by Frankl and Füredi [14] (see [14, Theorem 1.2]). Furthermore, Frankl and Füredi showed that (5) holds under the stronger assumption that $s - 1 \le R \le (s-1) + \frac{1}{10 s^2}$. A few years ago, Tao and the second author proved Conjecture 2.4 for $s \ge 3$. This, combined with the above mentioned earlier results, established the conjecture in full generality [66].

Theorem 2.5. Let $R, d$ be fixed. Then there exists a positive number $n_0 = n_0(R,d)$ such that the following holds for all $n \ge n_0$ and $s - 1 \le R < \sqrt{(s-1)^2 + 1}$:

\[ p(d,R,\mathrm{Ber},n) = 2^{-n} S(n,s). \]

We will present a short proof of Theorems 2.2 and 2.5 in Section 17.



3. Refinements by restrictions on A

A totally different direction of research started with the observation that the upper bound in (1) improves significantly if we make some extra assumption on the additive structure of $A$. In this section, it is more natural to present the results in discrete form. In the discrete setting, one considers the probability that $S_A$ takes a single value (for instance, $\mathbf{P}(S_A = 0)$).

Erdős's result in the first section implies

Theorem 3.1. Let $a_i$ be non-zero real numbers, then

\[ \sup_{x \in \mathbb{R}} \mathbf{P}(S_A = x) \le \frac{\binom{n}{\lfloor n/2 \rfloor}}{2^n} = O(n^{-1/2}). \]

Erdős and Moser [IT] showed that under the condition that the $a_i$ are different, the bound improves significantly.

Theorem 3.2. Let $a_i$ be distinct real numbers, then

\[ \sup_{x \in \mathbb{R}} \mathbf{P}(S_A = x) = O(n^{-3/2} \log n). \]

They conjectured that the $\log n$ term is not necessary. Sárközy and Szemerédi [50] confirmed this conjecture.

Theorem 3.3. Let $a_i$ be distinct real numbers, then

\[ p_A := \sup_{x} \mathbf{P}(S_A = x) = O(n^{-3/2}). \]

In [53], Stanley found a different (algebraic) proof for a more precise result, using the hard Lefschetz theorem from algebraic geometry.

Theorem 3.4 (Stanley's theorem). Let $n$ be odd and $A_0 := \{-\frac{n-1}{2}, \dots, \frac{n-1}{2}\}$. Let $A$ be any set of $n$ distinct real numbers, then

\[ p(A) := \sup_{x \in \mathbb{R}} \mathbf{P}(S_A = x) \le \sup_{x \in \mathbb{R}} \mathbf{P}(S_{A_0} = x). \]

A similar result holds for the case where $n$ is even; see [53]. Later, Proctor [41] found a simpler proof for Stanley's theorem. His proof is also algebraic, using tools from Lie algebra. It is interesting to see whether algebraic approaches can be used to obtain continuous results.



(For the continuous version of Theorem 3.3, see Section 10.)



A hierarchy of bounds. We have seen that Erdős' bound of $O(n^{-1/2})$ is sharp if we allow the $a_i$ to be the same. If we forbid this, then the next bound is $O(n^{-3/2})$, which can be attained if the $a_i$ form an arithmetic progression. Naturally, one would ask what happens if we forbid the $a_i$ to form an arithmetic progression, and so forth. Halász' result, discussed in Section 6, gives a satisfying answer to this question.
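The first two levels of this hierarchy can be observed numerically. The sketch below (function name illustrative) computes the exact concentration probability $\sup_x \mathbf{P}(S_A = x)$ by dynamic programming over the distribution of partial sums, assuming integer coefficients and Bernoulli signs, and prints it rescaled by $n^{1/2}$ for the all-ones multi-set and by $n^{3/2}$ for the arithmetic progression $\{1, \dots, n\}$.

```python
from collections import Counter
from fractions import Fraction


def concentration_probability(coeffs):
    """Exact max_x P(sum a_i xi_i = x) for iid +-1 signs, via dynamic programming."""
    coeffs = list(coeffs)
    dist = Counter({0: 1})               # number of sign patterns reaching each value
    for a in coeffs:
        step = Counter()
        for value, count in dist.items():
            step[value + a] += count
            step[value - a] += count
        dist = step
    return Fraction(max(dist.values()), 2 ** len(coeffs))


for n in (8, 16, 32):
    same = concentration_probability([1] * n)
    progression = concentration_probability(range(1, n + 1))
    # Both rescaled quantities should stay of constant order as n grows.
    print(n, float(same) * n ** 0.5, float(progression) * n ** 1.5)
```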

Remark 3.5. To conclude this section, let us mention that while discrete theorems such as Theorem 3.4 are formulated for real numbers, they hold for any infinite abelian group, thanks to a general trick called Freiman isomorphism (see [67] and also Appendix A). In particular, this trick allows us to assume that the $a_i$'s are integers in the proofs. Freiman isomorphism, however, is not always applicable in continuous settings.
4. Littlewood-Offord type bounds for higher degree polynomials

For simplicity, we present all results in this section in discrete form. The extension to the continuous setting is rather straightforward, and thus omitted.

One can view the sum $S = a_1\xi_1 + \dots + a_n\xi_n$ as a linear function of the random variables $\xi_1, \dots, \xi_n$. It is natural to study general polynomials of higher degree $k$. Let us first consider the case $k = 2$. Following [8], we refer to it as the Quadratic Littlewood-Offord problem.

Let $\xi_i$ be iid Bernoulli random variables, and let $A = (a_{ij})$ be an $n \times n$ symmetric matrix of real entries. We define the quadratic concentration probability of $A$ by

\[ p_q(A) := \sup_{a \in \mathbb{R}} \mathbf{P}\Big(\sum_{i,j} a_{ij}\xi_i\xi_j = a\Big). \]

Similar to the problem considered by Erdős and Littlewood-Offord, we may ask what upper bound one can prove for $p_q(A)$ provided that the entries $a_{ij}$ are non-zero. This question was first addressed by Costello, Tao and the second author in [8], motivated by their study of Weiss' problem concerning the singularity of a random symmetric matrix (see Section 5).

Theorem 4.1. Suppose that $a_{ij} \ne 0$ for all $1 \le i, j \le n$. Then

\[ p_q(A) = O(n^{-1/8}). \]



The key to the proof of Theorem 4.1 is a decoupling lemma, which can be proved using the Cauchy-Schwarz inequality. The reader may consider this lemma an exercise, or consult [8] for details.

Lemma 4.2 (Decoupling lemma). Let $Y$ and $Z$ be random variables and $E = E(Y,Z)$ be an event depending on $Y$ and $Z$. Then

\[ \mathbf{P}(E(Y,Z)) \le \mathbf{P}\big(E(Y,Z) \wedge E(Y',Z) \wedge E(Y,Z') \wedge E(Y',Z')\big)^{1/4}, \]

where $Y'$ and $Z'$ are independent copies of $Y$ and $Z$, respectively. Here we use $A \wedge B$ to denote the event that $A$ and $B$ both hold.
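The decoupling inequality can be verified exactly on a small finite example. The sketch below is an illustration under stated assumptions rather than a proof: it takes $Y$ and $Z$ independent and uniform on finite sets (independence of $Y$ and $Z$ is assumed here), represents the event by an arbitrary Boolean function, and computes both sides with exact rational arithmetic. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product
import random


def decoupling_sides(event, ys, zs):
    """Return P(E(Y,Z)) and P(E(Y,Z), E(Y',Z), E(Y,Z'), E(Y',Z')) for uniform independent Y, Z."""
    ys, zs = list(ys), list(zs)
    total = len(ys) * len(zs)
    single = Fraction(sum(bool(event(y, z)) for y in ys for z in zs), total)
    four = sum(
        bool(event(y, z) and event(y2, z) and event(y, z2) and event(y2, z2))
        for y, y2 in product(ys, repeat=2)      # (Y, Y') independent copies
        for z, z2 in product(zs, repeat=2)      # (Z, Z') independent copies
    )
    return single, Fraction(four, total ** 2)


random.seed(0)
table = {(y, z): random.random() < 0.4 for y in range(4) for z in range(5)}
single, four = decoupling_sides(lambda y, z: table[(y, z)], range(4), range(5))
print(single ** 4 <= four)
```

The comparison is made in the form $\mathbf{P}(E)^4 \le \mathbf{P}(\text{all four events})$, which avoids taking fourth roots of rationals.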



Consider the quadratic form $Q(x) := \sum_{i,j} a_{ij}\xi_i\xi_j$, and fix a non-trivial partition $\{1, \dots, n\} = U_1 \cup U_2$ and a non-empty subset $S$ of $U_1$. For instance, one can take $U_1$ to be the first half of the indices and $U_2$ to be the second half. Define $Y := (\xi_i)_{i \in U_1}$ and $Z := (\xi_i)_{i \in U_2}$. We can write $Q(x) = Q(Y,Z)$. Let $\xi_i'$ be an independent copy of $\xi_i$ and set $Y' := (\xi_i')_{i \in U_1}$ and $Z' := (\xi_i')_{i \in U_2}$. By Lemma 4.2, for any number $x$,

\[ \mathbf{P}(Q(Y,Z) = x) \le \mathbf{P}\big(Q(Y,Z) = Q(Y,Z') = Q(Y',Z) = Q(Y',Z') = x\big)^{1/4}. \]

On the other hand, if $Q(Y,Z) = Q(Y,Z') = Q(Y',Z) = Q(Y',Z') = x$ then, regardless of the value of $x$,

\[ R := Q(Y,Z) - Q(Y',Z) - Q(Y,Z') + Q(Y',Z') = 0. \]

Furthermore, we can write $R$ as

\[ R = 2\sum_{i \in U_1} \sum_{j \in U_2} a_{ij}(\xi_i - \xi_i')(\xi_j - \xi_j') = 2\sum_{i \in U_1} R_i w_i, \]

where $w_i$ is the random variable $w_i := \xi_i - \xi_i'$ and $R_i$ is the random variable $\sum_{j \in U_2} a_{ij} w_j$ (the factor 2 comes from the symmetry $a_{ij} = a_{ji}$ and plays no role in the event $R = 0$).

We now can conclude the proof by applying Theorem 3.1 twice. First, combining this theorem with a combinatorial argument, one can show that (with high probability) many $R_i$ are non-zero. Next, one can condition on the non-zero $R_i$ and apply Theorem 3.1 to the linear form $\sum_{i \in U_1} R_i w_i$ to obtain a bound on $\mathbf{P}(R = 0)$.
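The algebra behind the decoupled quantity $R$ is easy to confirm numerically. The following sketch assumes that $Q$ sums over all ordered pairs $(i,j)$ of a symmetric matrix, so the cross terms between $U_1$ and $U_2$ appear twice; under that convention $R$ equals twice the bilinear form in the differences $w_i = \xi_i - \xi_i'$. The helper names are hypothetical.

```python
import random


def quadratic_form(a, x):
    """Q(x) = sum over all ordered pairs (i, j) of a[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def decoupled_difference(a, u1, u2, xi, xi_prime):
    """R = Q(Y,Z) - Q(Y',Z) - Q(Y,Z') + Q(Y',Z'), with Y, Z the coordinates in u1, u2."""
    def mix(prime_y, prime_z):
        x = list(xi)
        for i in (u1 if prime_y else []):
            x[i] = xi_prime[i]
        for i in (u2 if prime_z else []):
            x[i] = xi_prime[i]
        return quadratic_form(a, x)
    return mix(False, False) - mix(True, False) - mix(False, True) + mix(True, True)


def bilinear_form(a, u1, u2, xi, xi_prime):
    """2 * sum_{i in u1} w_i * R_i, where w = xi - xi' and R_i = sum_{j in u2} a[i][j] * w_j."""
    w = [s - t for s, t in zip(xi, xi_prime)]
    return 2 * sum(w[i] * sum(a[i][j] * w[j] for j in u2) for i in u1)
```

Only the cross terms survive the alternating sum, which is why the diagonal blocks of the matrix never enter `bilinear_form`.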



The upper bound $n^{-1/8}$ in Theorem 4.1 can be easily improved to $n^{-1/4}$. The optimal bound was obtained by Costello [7] using, among others, the inverse theorems from Section 7.

Theorem 4.3 (Quadratic Littlewood-Offord inequality). Suppose that $a_{ij} \ne 0$, $1 \le i, j \le n$. Then

\[ p_q(A) \le n^{-1/2 + o(1)}. \]

The exponent $1/2 + o(1)$ is best possible (up to the $o(1)$ term) as demonstrated by the quadratic form $\sum_{i,j} \xi_i\xi_j = (\sum_{i=1}^n \xi_i)^2$. Both Theorems 4.1 and 4.3 hold in a general setting where the $\xi_i$ are not necessarily Bernoulli and only a fraction of the $a_{ij}$'s are non-zero.

One can extend the argument above to give bounds of the form $n^{-c_k}$ for a general polynomial of degree $k$. However, due to the repeated use of the decoupling lemma, $c_k$ decreases very fast with $k$.

Theorem 4.4. Let $f$ be a multilinear polynomial of real coefficients in $n$ variables $\xi_1, \dots, \xi_n$ with $m \times n^{k-1}$ monomials of maximum degree $k$. If the $\xi_i$ are iid Bernoulli random variables, then for any value $x$

\[ \mathbf{P}(f = x) = O\Big(m^{-\frac{1}{2^{(k^2+k)/2}}}\Big). \]



By a more refined analysis, Razborov and Viola f32] recently obtained a better exponent (see Section 16). On the other hand, it might be the case that the bound $n^{-1/2+o(1)}$ of Theorem 4.3 holds for all degrees $k \ge 2$, under some reasonable assumption on the coefficients of the polynomial.



Quadratic (and higher degree) Littlewood-Offord bounds play important roles in the study of random symmetric matrices and Boolean circuits. We will discuss these applications in Sections 5 and 16, respectively.
5. Application: Singularity of random Bernoulli matrices

Let $M_n$ be a random matrix of size $n$ whose entries are iid Bernoulli random variables. A notorious open problem in probabilistic combinatorics is to estimate $p_n$, the probability that $M_n$ is singular (see [231 EZ] for more details).

Conjecture 5.1. $p_n = (1/2 + o(1))^n$.

To give the reader a feeling for how the Littlewood-Offord problem can be useful in estimating $p_n$, let us consider the following process. We expose the rows of $M_n$ one by one from the top. Assume that the first $n - 1$ rows are linearly independent and span a hyperplane with normal vector $v = (a_1, \dots, a_n)$. Conditioned on these rows, the probability that $M_n$ is singular is

\[ \mathbf{P}(X \cdot v = 0) = \mathbf{P}(a_1\xi_1 + \dots + a_n\xi_n = 0), \]

where $X = (\xi_1, \dots, \xi_n)$ is the last row.

As an illustration, let us give a short proof for the classical bound $p_n = o(1)$ (first shown by Komlós in [28] using a different argument).

Theorem 5.2. $p_n = o(1)$.

We start with a simple observation [23].

Fact 5.3. Let $H$ be a subspace of dimension $1 \le d \le n$. Then $H$ contains at most $2^d$ Bernoulli vectors.

To see this, notice that in a subspace of dimension $d$, there is a set of $d$ coordinates which determine the others. This fact implies

\[ p_n \le \sum_{i=1}^{n-1} \mathbf{P}(x_{i+1} \in H_i) \le \sum_{i=1}^{n-1} 2^{i-n} \le 1 - \frac{2}{2^n}, \]

where $H_i$ is the subspace generated by the first $i$ rows $x_1, \dots, x_i$ of $M_n$.
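For very small $n$ the singularity probability can be computed by exhaustive enumeration and compared with the crude bound $1 - 2/2^n$ above. This is a brute-force sketch with illustrative function names; the cofactor-expansion determinant and the loop over all $2^{n^2}$ sign matrices are only practical for $n \le 3$ or so.

```python
from fractions import Fraction
from itertools import product


def determinant(rows):
    """Integer determinant by cofactor expansion along the first row (tiny matrices only)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total


def singular_probability(n):
    """Exact probability that a uniformly random n x n matrix with +-1 entries is singular."""
    singular = 0
    for entries in product((-1, 1), repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        singular += determinant(rows) == 0
    return Fraction(singular, 2 ** (n * n))


for n in (1, 2, 3):
    print(n, singular_probability(n), 1 - Fraction(2, 2 ** n))
```

For $n = 2$ the crude bound is attained, which illustrates why the row-by-row estimate alone cannot give $p_n = o(1)$.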

This bound is quite the opposite of what we want to prove. However, we notice that the loss comes at the end. Thus, to obtain the desired upper bound $o(1)$, it suffices to show that the sum of the last (say) $\log\log n$ terms is at most (say) $\frac{1}{\log^{1/3} n}$. To do this, we will exploit the fact that the $H_i$ are spanned by random vectors. The following lemma (which is a more effective version of the above fact) implies the theorem via the union bound.

Lemma 5.4. Let $H$ be the subspace spanned by $d$ random vectors, where $d \ge n - \log\log n$. Then with probability at least $1 - \frac{1}{n}$, $H$ contains at most $\frac{2^n}{\log^{1/3} n}$ Bernoulli vectors.

We say that a set $S$ of $d$ vectors is $k$-universal if for any set of $k$ different indices $1 \le i_1, \dots, i_k \le n$ and any set of signs $\epsilon_1, \dots, \epsilon_k$ ($\epsilon_i = \pm 1$), there is a vector $V$ in $S$ such that the sign of the $i_j$-th coordinate of $V$ matches $\epsilon_j$, for all $1 \le j \le k$.

Fact 5.5. If $d \ge n/2$, then with probability at least $1 - \frac{1}{n}$, a set of $d$ random vectors is $k$-universal, for $k = \log n/10$.

To prove this, notice that the failure probability is, by the union bound over the choices of indices and signs, at most

\[ \binom{n}{k} 2^k \big(1 - 2^{-k}\big)^d \le \frac{1}{n} \]

for all sufficiently large $n$.

If $S$ is $k$-universal, then any non-zero vector $v$ in the orthogonal complement of the subspace spanned by $S$ should have more than $k$ non-zero coordinates (otherwise, there would be a vector in $S$ having positive inner product with $v$). If we fix such $v$, and let $x$ be a random Bernoulli vector, then by Theorem 3.1

\[ \mathbf{P}(x \in \mathrm{span}(S)) \le \mathbf{P}(x \cdot v = 0) = O\Big(\frac{1}{k^{1/2}}\Big) = o\Big(\frac{1}{\log^{1/3} n}\Big), \]

proving Lemma 5.4 and Theorem 5.2.



The symmetric version of Theorem 5.2 is much harder and has been open for quite some time (the problem was raised by Weiss in the 1980s). Let $p_n^{sym}$ be the singular probability of a random symmetric matrix whose upper diagonal entries are iid Bernoulli variables. Weiss conjectured that $p_n^{sym} = o(1)$. This was proved by Costello, Tao, and the second author [8]. Somewhat interestingly, this proof made use of the argument of Komlós in [5H], which he applied to non-symmetric matrices. Instead of exposing the matrix row by row, one needs to expose the principal minors one by one, starting with the top left entry. At step $i$, one has a symmetric matrix $M_i$ of size $i$ and the next matrix $M_{i+1}$ is obtained by adding a row and its transpose. Following Komlós, one defines $X_i$ as the co-rank of the matrix at step $i$ and shows that the sequence $X_i$ behaves as a biased random walk with a positive drift. Carrying out the calculation carefully, one obtains that $X_n = 0$ with high probability.

The key technical step of this argument is to show that if $M_i$ has full rank then so does $M_{i+1}$, with very high probability. Here the quadratic Littlewood-Offord bound is essential. Notice that if we condition on $M_i$, then $\det(M_{i+1})$ is a quadratic form of the entries in the additional ($(i+1)$-th) row, with coefficients being the co-factors of $M_i$. By looking at these co-factors closely and using Theorem 4.1 (to be more precise, a variant of it where only a fraction of coefficients are required to be non-zero), one can establish Weiss' conjecture.

Theorem 5.6. $p_n^{sym} = o(1)$.

Getting strong quantitative bounds for $p_n$ and $p_n^{sym}$ is more challenging, and we will continue this topic in Sections 13 and 14 after the introduction of inverse theorems.

6. Halász' results

In [21] (see also [67]), Halász proved the following very general theorem.

Theorem 6.1. Suppose that there exists a constant $\delta > 0$ such that the following holds:

• (General position) for any unit vector $e$ in $\mathbb{R}^d$ one can select at least $\delta n$ vectors $a_k$ with $|\langle a_k, e \rangle| \ge 1$;

• (Separation) among the $2^d$ vectors $b$ of the form $\pm a_{k_1} \pm \dots \pm a_{k_d}$ one can select at least $\delta 2^d$ with pairwise distance at least 1.

Then

\[ p_{d,1,\mathrm{Ber}}(A) = O_{\delta,d}(n^{-3d/2}). \]

Halász' method is Fourier analytic, and uses the following powerful Esseen-type concentration inequality as the starting point (see [21], [12]).

Lemma 6.2. There exists an absolute positive constant $C = C(d)$ such that for any random variable $X$ and any unit ball $B \subset \mathbb{R}^d$

\[ \mathbf{P}(X \in B) \le C \int_{\|t\|_2 \le 1} |\mathbf{E}(\exp(i\langle t, X \rangle))| \, dt. \tag{7} \]



Proof (of Lemma 6.2). With the function $k(t)$ to be defined later, let $K(x)$ be its Fourier transform

\[ K(x) = \int_{\mathbb{R}^d} \exp(i\langle x, t \rangle) k(t) \, dt. \]

Let $H(x)$ be the distribution function and $h(t)$ be the characteristic function of $X$, respectively. By Parseval's identity we have

\[ \int_{\mathbb{R}^d} K(x) \, dH(x) = \int_{\mathbb{R}^d} k(t) h(t) \, dt. \tag{8} \]

If we choose $k(t)$ so that

\[ k(t) = 0 \ \text{for } \|t\|_2 \ge 1, \qquad |k(t)| \le c_1 \ \text{for } \|t\|_2 \le 1, \]

then the RHS of (8) is bounded by that of (7) modulo a constant factor.
Also, if

\[ K(x) \ge 1 \ \text{for } \|x\|_2 \le c_2 \ \text{(for some constant } c_2\text{)}, \qquad K(x) \ge 0 \ \text{for } \|x\|_2 \ge c_2, \]

then the LHS of (8) is at least $\int_{\|x\|_2 \le c_2} dH(x)$.

Similarly, by translating $K(x)$ (i.e. by multiplying $k(t)$ by a phase $\exp(-i\langle t_0, t \rangle)$), we obtain the same upper bound for $\int_{\|x - t_0\|_2 \le c_2} dH(x)$. Thus, by covering the unit ball $B$ with balls of radius $c_2$, we arrive at (7), for some constant $C$ depending on $d$.

To construct $k(t)$ with the properties above, one may take it to have the convolution form

\[ k(t) := \int_{\mathbb{R}^d} k_1(x) k_1(t - x) \, dx, \]

where $k_1(x) = 1$ if $\|x\|_2 \le 1/2$ and $k_1(x) = 0$ otherwise.

□



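To illustrate Halász' method, let us give a quick proof of Erdős' bound $O(n^{-1/2})$ for the small ball probability $p_{1,1,\mathrm{Ber}}(A)$ with $A$ being a multi-set of $n$ real numbers of absolute value at least 1. In view of Lemma 6.2, it suffices to show that

\[ \int_{|t| \le 1} \Big|\mathbf{E}\Big(\exp\Big(it \sum_{j=1}^n a_j\xi_j\Big)\Big)\Big| \, dt = O(1/\sqrt{n}). \]

By the independence of the $\xi_j$, we have

\[ \Big|\mathbf{E}\Big(\exp\Big(it \sum_{j=1}^n a_j\xi_j\Big)\Big)\Big| = \prod_{j=1}^n |\mathbf{E}(\exp(it a_j\xi_j))| = \Big|\prod_{j=1}^n \cos(t a_j)\Big|. \]

By Hölder's inequality

\[ \int_{|t| \le 1} \Big|\mathbf{E}\Big(\exp\Big(it \sum_{j=1}^n a_j\xi_j\Big)\Big)\Big| \, dt \le \prod_{j=1}^n \Big(\int_{|t| \le 1} |\cos(t a_j)|^n \, dt\Big)^{1/n}. \]

But since each $a_j$ has magnitude at least 1, it is easy to check that $\int_{|t| \le 1} |\cos(t a_j)|^n \, dt = O(1/\sqrt{n})$, and the claim follows.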
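The last estimate, that the integral of $|\cos(ta)|^n$ over $|t| \le 1$ is $O(1/\sqrt{n})$ whenever $|a| \ge 1$, can be sanity-checked numerically. The sketch below uses a plain midpoint rule; the function name, the step count and the sample values of $a$ and $n$ are arbitrary choices for illustration, and the output is a numerical approximation rather than a proof.

```python
from math import cos, sqrt


def cosine_power_integral(a, n, steps=20000):
    """Midpoint-rule approximation of the integral of |cos(t * a)|^n over -1 <= t <= 1."""
    h = 2.0 / steps
    return h * sum(abs(cos((-1.0 + (k + 0.5) * h) * a)) ** n for k in range(steps))


for n in (10, 100, 1000):
    # After rescaling by sqrt(n) the values should remain bounded as n grows.
    print(n, [round(sqrt(n) * cosine_power_integral(a, n), 3) for a in (1.0, 2.0, 7.3)])
```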

Using Halász' technique, it is possible to deduce

Corollary 6.3. [67, Corollary 7.16] Let $A$ be a multi-set in $\mathbb{R}$. Let $l$ be a fixed integer and $R_l$ be the number of solutions of the equation $a_{i_1} + \dots + a_{i_l} = a_{j_1} + \dots + a_{j_l}$. Then

\[ p_A := \sup_x \mathbf{P}(S_A = x) = O\big(n^{-2l - \frac{1}{2}} R_l\big). \]

This result provides the hierarchy of bounds mentioned in the previous section, given that we forbid more and more additive structures on $A$. Let us consider the first few steps of the hierarchy.

• If the $a_i$'s are distinct, then we can set $l = 1$ and $R_1 = n$ (the only solutions are the trivial ones $a_i = a_i$). Thus, we obtain Sárközy-Szemerédi's bound $O(n^{-3/2})$.

• If we forbid the $a_i$'s to satisfy equations $a_i + a_j = a_k + a_l$, for any $\{i,j\} \ne \{k,l\}$ (in particular this prohibits $A$ from being an arithmetic progression), then one can fix $l = 2$ and $R_2 = O(n^2)$ and obtain $p_A = O(n^{-5/2})$.

• If we continue to forbid equations of the form $a_h + a_i + a_j = a_k + a_l + a_m$, $\{h,i,j\} \ne \{k,l,m\}$, then one obtains $p_A = O(n^{-7/2})$, and so on.
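The quantity $R_l$ in Corollary 6.3 is straightforward to compute for small examples. The sketch below counts ordered index tuples (an assumption about the counting convention; it only affects constants) and compares an arithmetic progression, which has many non-trivial solutions for $l = 2$, with the set of powers of two, which has only the trivial ones. The function name is illustrative.

```python
from collections import Counter
from itertools import product


def count_solutions(coeffs, l):
    """R_l: number of ordered tuples with a_{i_1}+...+a_{i_l} = a_{j_1}+...+a_{j_l}."""
    coeffs = list(coeffs)
    sums = Counter(sum(t) for t in product(coeffs, repeat=l))
    # Each value hit c times by l-fold sums contributes c * c solutions.
    return sum(c * c for c in sums.values())


progression = list(range(1, 7))
powers_of_two = [2 ** i for i in range(6)]
print(count_solutions(progression, 2), count_solutions(powers_of_two, 2))
```

For $n$ distinct elements with no non-trivial relation $a_i + a_j = a_k + a_l$, the ordered count is $2n^2 - n$, consistent with $R_2 = O(n^2)$ in the second step of the hierarchy.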

Halasz' method is very powerful and has a strong influence on the recent developments 
discussed in the coming sections. 



7. Inverse theorems: Discrete case

A few years ago, Tao and the second author [60] brought a new view to the small ball problem. Instead of working out a hierarchy of bounds by imposing new assumptions as done in Corollary 6.3, they tried to find the underlying reason as to why the small ball probability is large (say, polynomial in $n$).

It is easier and more natural to work with the discrete problem first. Let $A$ be a multi-set of integers and $\xi$ be the Bernoulli random variable.

Question 7.1 (Inverse problem, [60]). Let $n \to \infty$. Assume that for some constant $C$

\[ p_A = \sup_x \mathbf{P}(S_A = x) \ge n^{-C}. \]

What can we say about the elements $a_1, \dots, a_n$ of $A$?

Denote by $M$ the sum of all elements of $A$ and rewrite $\sum_i a_i\xi_i$ as $M - 2\sum_{i : \xi_i = -1} a_i$. As $A$ has $2^n$ subsets, the bound $p_A \ge n^{-C}$ implies that at least $2^n/n^C$ among the subset sums are exactly $(M - x)/2$. This overwhelming collision suggests that $A$ must have some strong additive structure. Tao and the second author proposed

Inverse Principle:

A set with large small ball probability must have strong additive structure. (9)
The issue is, of course, to quantify the statement. Before attacking this question, let us recall the famous Freiman's inverse theorem from Additive Combinatorics. As the reader will see, this theorem strongly motivates our study.

In the 1970s, Freiman considered the collection of pairwise sums $A + A := \{a + a' \mid a, a' \in A\}$ [15]. Normally, one expects this collection to have $\Theta(|A|^2)$ elements. Freiman proved a deep and powerful theorem showing that if $A + A$ has only $O(|A|)$ elements (i.e., a huge number of collisions occur) then $A$ must look like an arithmetic progression. (Notice that if $A$ is an arithmetic progression then $|A + A| \approx 2|A|$.)

To make Freiman's statement more precise, we need the definition of generalized arithmetic progressions (GAPs).

Definition 7.2. A set $Q$ is a GAP of rank $r$ if it can be expressed in the form

\[ Q = \{g_0 + m_1 g_1 + \dots + m_r g_r \mid M_i \le m_i \le M_i',\ m_i \in \mathbb{Z} \text{ for all } 1 \le i \le r\} \]

for some $g_0, \dots, g_r, M_1, \dots, M_r, M_1', \dots, M_r'$.

It is convenient to think of $Q$ as the image of an integer box $B := \{(m_1, \dots, m_r) \in \mathbb{Z}^r \mid M_i \le m_i \le M_i'\}$ under the linear map

\[ \Phi : (m_1, \dots, m_r) \mapsto g_0 + m_1 g_1 + \dots + m_r g_r. \]

The numbers $g_i$ are the generators of $Q$, the numbers $M_i', M_i$ are the dimensions of $Q$, and $\mathrm{Vol}(Q) := |B|$ is the volume of $Q$. We say that $Q$ is proper if this map is one to one, or equivalently if $|Q| = \mathrm{Vol}(Q)$. For non-proper GAPs, we of course have $|Q| < \mathrm{Vol}(Q)$. If $-M_i = M_i'$ for all $i \ge 1$ and $g_0 = 0$, we say that $Q$ is symmetric.

If $Q$ is symmetric and $t > 0$, the dilate $tQ$ is the set

\[ \{m_1 g_1 + \dots + m_r g_r \mid -tM_i' \le m_i \le tM_i' \text{ for all } 1 \le i \le r\}. \]

It is easy to see that if $Q$ is a proper GAP of rank $r$, then $|Q + Q| \le 2^r |Q|$. This implies that if $A$ is a subset of density $\delta$ in a proper GAP $Q$ of rank $r$, then as long as $\delta = \Theta(1)$,

\[ |A + A| \le |Q + Q| \le 2^r |Q| \le \frac{2^r}{\delta} |A| = O(|A|). \]

Thus, dense subsets of a proper GAP of constant rank satisfy the assumption $|A + A| = O(|A|)$. Freiman's remarkable inverse theorem showed that this example is the only one.
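The definitions of GAP, volume, properness and the doubling bound $|Q + Q| \le 2^r|Q|$ can be illustrated on a small integer example. The sketch below uses hypothetical helper names and hand-picked generators; the first GAP is proper because its second generator is much larger than the range spanned by the first, while the second one is deliberately non-proper.

```python
from itertools import product


def gap(generators, bounds, base=0):
    """Set of all g_0 + sum m_i * g_i with M_i <= m_i <= M_i'; bounds lists the pairs (M_i, M_i')."""
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return {base + sum(m * g for m, g in zip(ms, generators)) for ms in product(*ranges)}


def volume(bounds):
    """Vol(Q): the number of points in the integer box."""
    vol = 1
    for lo, hi in bounds:
        vol *= hi - lo + 1
    return vol


def sumset(a, b):
    return {x + y for x in a for y in b}


bounds = [(-3, 3), (-2, 2)]
proper = gap([1, 100], bounds)        # rank 2, symmetric, proper
collapsed = gap([1, 2], bounds)       # same box, but the map is not one to one
print(len(proper), volume(bounds), len(sumset(proper, proper)), len(collapsed))
```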

Theorem 7.3 (Freiman's inverse theorem in $\mathbb{Z}$). Let $\gamma$ be a given positive number. Let $X$ be a set in $\mathbb{Z}$ such that $|X + X| \le \gamma |X|$. Then there exists a proper GAP of rank $O_\gamma(1)$ and cardinality $O_\gamma(|X|)$ that contains $X$.

For further discussions, including a beautiful proof by Ruzsa, see [67, Chapter 5]; see also [5] for recent and deep developments concerning non-commutative settings (when $A$ is a subset of a non-abelian group).
In our case, we want to find examples for $A$ such that $p(A) := \sup_x \mathbf{P}(S_A = x)$ is large. Again, dense subsets of a proper GAP come in as natural candidates.

Example 7.4. Let $Q$ be a proper symmetric GAP of rank $r$ and volume $N$. Let $a_1, \dots, a_n$ be (not necessarily distinct) elements of $Q$. By the Central Limit Theorem, with probability at least 2/3, the random sum $S_A = \sum_{i=1}^n a_i\xi_i$ takes value in the dilate $10 n^{1/2} Q$. Since $|tQ| \le t^r N$, by the pigeonhole principle, we can conclude that there is a point $x$ where

\[ \mathbf{P}(S_A = x) = \Omega\Big(\frac{1}{n^{r/2} N}\Big). \]

Thus if $|Q| = N = O(n^{C - r/2})$ for some constant $C \ge r/2$, then

\[ p(A) \ge \mathbf{P}(S_A = x) = \Omega\Big(\frac{1}{n^C}\Big). \]

This example shows that if the elements of $A$ are elements of a symmetric proper GAP with a small rank and small cardinality, then $p(A)$ is large. Inspired by Freiman's theorem, Tao and the second author [62, 60] showed that the converse is also true.

Theorem 7.5. For any constants $C, \epsilon$ there are constants $r, B$ such that the following holds. Let $A$ be a multi-set of $n$ real numbers such that $p(A) \ge n^{-C}$; then there is a GAP $Q$ of rank $r$ and volume $n^B$ such that all but $n^{1-\epsilon}$ elements of $A$ belong to $Q$.

The dependence of $B$ on $C, \epsilon$ is not explicit in [60]. In [62], Tao and the second author obtained an almost sharp dependence. The best dependence, which mirrors Example 7.4, was proved in a more recent paper [39] of the current authors. This proof is different from the earlier proofs and makes direct use of Freiman's theorem (see Appendix A).

Theorem 7.6 (Optimal inverse Littlewood-Offord theorem, discrete case). [39] Let $\epsilon < 1$ and $C$ be positive constants. Assume that

\[ p(A) \ge n^{-C}. \]

Then there exists a proper symmetric GAP $Q$ of rank $r = O_{C,\epsilon}(1)$ which contains all but at most $\epsilon n$ elements of $A$ (counting multiplicities), where

\[ |Q| = O_{C,\epsilon}\big(p(A)^{-1} n^{-r/2}\big). \]

The existence of the exceptional set cannot be avoided completely. For more discussions, see [60, 39]. There is also a trade-off between the size of the exceptional set and the bound on $|Q|$. In many combinatorial applications (see, for instance, the next section), an exceptional set of size $\epsilon n$ does not create any trouble.

Let us also point out that the above inverse theorems hold in a very general setting where the random variables are not necessarily Bernoulli and independent (see [601 E21 EHl [38] for more details).
8. Application: From Inverse to Forward

One can use the "inverse" Theorem 7.6 to quickly prove several "forward" theorems presented in earlier sections. As an example, let us derive Theorems 3.1 and 3.3.

Proof (of Theorem 3.1). Assume, for contradiction, that there is a set $A$ of $n$ non-zero numbers such that $p(A) \ge c_1 n^{-1/2}$ for some large constant $c_1$ to be chosen. Set $\epsilon = .1$, $C = 1/2$. By Theorem 7.6, there is a GAP $Q$ of rank $r$ and size $O(\frac{1}{c_1} n^{C - \frac{r}{2}})$ that contains at least $.9n$ elements from $A$. However, by setting $c_1$ to be sufficiently large (compared to the constant in big $O$) and using the fact that $C = 1/2$ and $r \ge 1$, we can force $O(\frac{1}{c_1} n^{C - \frac{r}{2}}) < 1$. Thus, $Q$ has to be empty, a contradiction. □

Proof (of Theorem 3.3). Similarly, assume that there is a set $A$ of $n$ distinct numbers such that $p(A) \ge c_1 n^{-3/2}$ for some large constant $c_1$ to be chosen. Set $\epsilon = .1$, $C = 3/2$. By Theorem 7.6, there is a GAP $Q$ of rank $r$ and size $O(\frac{1}{c_1} n^{C - \frac{r}{2}})$ that contains at least $.9n$ elements from $A$. This implies $|Q| \ge .9n$. By setting $c_1$ to be sufficiently large and using the fact that $C = 3/2$ and $r \ge 1$, we can guarantee that $|Q| \le .8n$, a contradiction. □

The readers are invited to work out the proof of Corollary 6.3.



Let us now consider another application of Theorem 7.6, which enables us to make very precise counting arguments. Assume that we would like to count the number of multi-sets $A$ of integers with $\max |a_i| \le M = n^{O(1)}$ such that $p(A) \ge n^{-C}$.

Fix $d \ge 1$, fix$^1$ a GAP $Q$ with rank $r$ and volume $|Q| \le c\, p(A)^{-1} n^{-r/2}$ for some constant $c$ depending on $C$ and $\epsilon$. The dominating term in the calculation will be the number of multi-sets which intersect with $Q$ in subsets of size at least $(1 - \epsilon)n$. This number is bounded by

\[ \sum_{k \le \epsilon n} |Q|^{n-k} (2M)^k \le \sum_{k \le \epsilon n} \big(c\, p(A)^{-1} n^{-r/2}\big)^{n-k} (2M)^k \le (O_{C,\epsilon}(1))^n\, n^{O(\epsilon n)}\, p(A)^{-n} n^{-n/2}. \tag{10} \]

We thus obtain the following useful result.

Theorem 8.1 (Counting theorem: Discrete case). The number $N$ of multi-sets $A$ of integers with $\max |a_i| \le n^{O(1)}$ and $p(A) \ge n^{-C}$ is bounded by

\[ N = (O_{C,\epsilon}(1))^n\, n^{O(\epsilon n)}\, \big(p(A)^{-1} n^{-1/2}\big)^n, \]

where $\epsilon$ is an arbitrary constant between 0 and 1.

$^1$ A more detailed version of Theorem 7.6 tells us that there are not too many ways to choose the generators of $Q$. In particular, if $|a_i| \le M = n^{O(1)}$, the number of ways to fix these is negligible compared to the main term.
Due to their asymptotic nature, our inverse theorems do not directly imply Stanley's precise result (Theorem 3.4). However, by refining the proofs, one can actually get very close and with some bonus, namely, additional strong rigidity information. For instance, in [37] the first author showed that if the elements of $A$ are distinct, then

\[ \mathbf{P}(S_A = x) \le \Big(\sqrt{\frac{24}{\pi}} + o(1)\Big) n^{-3/2}, \]

where the constant on the RHS is attained when $A$ is the symmetric arithmetic progression $A_0$ from Theorem 3.4. It was shown that if $p(A)$ is close to this value, then $A$ needs to be very close to a symmetric arithmetic progression.

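Theorem 8.2. [37] There exists a positive constant $\epsilon_0$ such that for any $0 < \epsilon \le \epsilon_0$, there exists a positive number $\epsilon' = \epsilon'(\epsilon)$ such that $\epsilon' \to 0$ as $\epsilon \to 0$ and the following holds: if $A$ is a set of $n$ distinct integers and

\[ p(A) \ge \Big(\sqrt{\frac{24}{\pi}} - \epsilon\Big) n^{-3/2}, \]

then there exists an integer $l$ which divides all $a \in A$ and

\[ \sum_{a \in A} \Big(\frac{a}{l}\Big)^2 \le (1 + \epsilon') \sum_{a \in A_0} a^2. \]

We remark that a slightly weaker stability can be shown even when we have a much weaker assumption $p(A) \ge \epsilon n^{-3/2}$.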

As the reader will see, in many applications in the following sections, we do not use the
inverse theorems directly, but rather their counting corollaries, such as Theorem 8.1. Such
counting results can be used to bound the probability of a bad event through the union
bound (they count the number of terms in the union). This method was first used in studies
of random matrices [57, 60, 45], but it is simpler to illustrate the idea by the following more
recent result of Conlon, Fox, and Sudakov [6].

A Hilbert cube is a set of the form $x_0 + \Sigma(\{x_1, \dots, x_d\})$ where $\Sigma(X) = \{\sum_{x \in Y} x : Y \subset X\}$,
and $0 \le x_0$, $0 < x_1 < \dots < x_d$ are integers. Following the literature, we refer to the
index d as the dimension. One of the earliest results in Ramsey theory is a theorem of
Hilbert [22] stating that for any fixed r and d and n sufficiently large, any coloring of the
set $[n] := \{1, \dots, n\}$ with r colors must contain a monochromatic Hilbert cube of dimension
d. Let $h(d, r)$ be the smallest such n. The best known upper bound for this function is
[22, 20]

$$h(d,r) \le (2r)^{2^{d-1}}.$$

The density version of [55] states that for any natural number d and $\delta > 0$ there exists an $n_0$
such that if $n \ge n_0$ then any subset of $[n]$ of density $\delta$ contains a Hilbert cube of dimension
d. One can show that

$$d \ge c \log\log n,$$

where c is a positive constant depending only on $\delta$.



On the other hand, Hegyvári shows an upper bound of the form $O(\sqrt{\log n \log\log n})$ by
considering a random subset of density $\delta$. Using the discrete inverse theorems (Section 7),
Conlon, Fox, and Sudakov [6] removed the $\log\log n$ term, obtaining $O(\sqrt{\log n})$, which is
sharp up to the constant in big O, thanks to another result of Hegyvári.



Conlon et al. started with the following corollary of Theorem 7.5.



Lemma 8.3. For every $C > 0$ and $1 > \epsilon > 0$ there exist positive constants r and C' such that
if X is a multiset with d elements and $|\Sigma(X)| \le d^{C}$, then there is a GAP Q of dimension
r and volume at most $d^{C'}$ such that all but at most $d^{1-\epsilon}$ elements of X are contained in Q.

From this, one can easily prove the following counting lemma. 

Lemma 8.4. For $s \le \log d$, the number of d-sets $X \subset [n]$ with $|\Sigma(X)| \le 2^s d^2$ is at most
$n^{O(s)} d^{O(d)}$.
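To see why additive structure keeps the subset-sum set $\Sigma(X)$ small, one can compare an arithmetic progression with a set of powers of two. The sketch below is only an illustration of the quantity $|\Sigma(X)|$ appearing in the two lemmas; the function name `subset_sums` and the two test sets are our own choices.

```python
def subset_sums(xs):
    """Return the set of all subset sums of xs (the empty subset contributes 0)."""
    sums = {0}
    for x in xs:
        sums |= {s + x for s in sums}
    return sums


d = 12
progression = list(range(1, d + 1))   # highly structured: 1, 2, ..., d
powers = [2 ** i for i in range(d)]   # no additive coincidences at all

print(len(subset_sums(progression)))  # d(d+1)/2 + 1 values
print(len(subset_sums(powers)))       # 2^d values
```

The progression produces only quadratically many sums in d, while the powers of two produce all $2^d$ distinct sums, which is the dichotomy the counting lemma exploits.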



Let A be a random subset of $[n]$ obtained by choosing each number with probability $\delta$
independently. Let E be the event that A contains a Hilbert cube of dimension $d = c\sqrt{\log n}$. We
aim to show that

$$P(E) = o(1), \tag{11}$$

given c sufficiently large.

Trivially $P(E) \le n \sum_{X \subset [n], |X| = d} \delta^{|\Sigma(X)|}$, where the factor n corresponds to the number of ways
to choose $x_0$. Let $m_t$ be the number of X such that $|\Sigma(X)| = t$. The RHS can be bounded
from above by $n \sum_t m_t \delta^t$.

If t is large, say $t \ge d^3$, we just crudely bound $\sum_{t \ge d^3} m_t$ by $n^d$ (which is the total number
of ways to choose $x_1, \dots, x_d$). The contribution in probability in this case is at most
$n \times n^d \delta^{d^3} = o(1)$ if c is sufficiently large. In the case $t < d^3$, we make use of the counting
lemma above to bound $m_t$ and a routine calculation finishes the job.



9. Inverse Theorems: Continuous case I. 



In this section and the next, we consider sets with large small ball probability.

We say that a vector $v \in \mathbb{R}^d$ is $\delta$-close to a set $Q \subset \mathbb{R}^d$ if there exists a vector $q \in Q$ such
that $\|v - q\|_2 \le \delta$. A set X is $\delta$-close to a set Q if every element of X is $\delta$-close to Q. The
continuous analogue of Example 7.4 is the following.



Example 9.1. Let Q be a proper symmetric GAP of rank r and volume N in $\mathbb{R}^d$. Let
$a_1, \dots, a_n$ be (not necessarily distinct) vectors which are $\frac{1}{100}\beta n^{-1/2}$-close to Q. Again by
the Central Limit Theorem, with probability at least 2/3, $S_A$ is $\beta$-close to $10 n^{1/2} Q$. Thus,
by the pigeonhole principle, there is a point x in $10 n^{1/2} Q$ such that

$$P(S_A \in B(x, \beta)) \ge \tfrac{2}{3} |10 n^{1/2} Q|^{-1} \ge \Omega(n^{-r/2} |Q|^{-1}).$$

It follows that if Q has cardinality $n^{C - r/2}$ for some constant $C \ge r/2$, then

$$\rho_{d,\beta,Ber}(A) = \Omega(n^{-C}). \tag{12}$$



Thus, in view of the Inverse Principle and Theorem 7.6, we would expect that if
$\rho_{d,\beta,Ber}(A)$ is large, then most of the $a_i$ are close to a GAP with small volume. This statement
turns out to hold for very general random variables $\xi$ (not only for Bernoulli). In practice,
we can consider any real random variable $\xi$ which satisfies the following condition: there
are positive constants $C_1, C_2, C_3$ such that

$$P(C_1 \le |\xi_1 - \xi_2| \le C_2) \ge C_3, \tag{13}$$

where $\xi_1, \xi_2$ are iid copies of $\xi$.



Theorem 9.2. [39] Let $\xi$ be a real random variable satisfying (13). Let $0 < \epsilon < 1$ and $0 < C$ be
constants and $\beta > 0$ be a parameter that may depend on n. Suppose that $A = \{a_1, \dots, a_n\}$ is
a (multi-)subset of $\mathbb{R}^d$ such that $\sum_{i=1}^n \|a_i\|_2^2 = 1$ and that A has large small ball probability

$$\rho := \rho_{d,\beta,\xi}(A) \ge n^{-C}.$$

Then there exists a symmetric proper GAP Q of constant rank $r \ge d$ and of size $|Q| =
O(\rho^{-1} n^{(-r+d)/2})$ such that all but $\epsilon n$ elements of A are $O(\beta \frac{\log n}{n^{1/2}})$-close to Q.

The next result gives more information about Q, but with a weaker approximation. 

Theorem 9.3. Under the assumption of the above theorem, the following holds. For any
number n' between $n^{\epsilon}$ and n, there exists a proper symmetric GAP $Q = \{\sum_{i=1}^r x_i g_i : |x_i| \le L_i\}$
such that

• At least $n - n'$ elements of A are $\beta$-close to Q.

• Q has small rank, $r = O(1)$, and small cardinality

$$|Q| \le \max\Big(O\Big(\frac{\rho^{-1}}{\sqrt{n'}}\Big), 1\Big).$$

• There is a non-zero integer $p = O(\sqrt{n'})$ such that all steps $g_i$ of Q have the form
$g_i = (g_{i1}, \dots, g_{id})$, where $g_{ij} = \beta \frac{p_{ij}}{p}$ with $p_{ij} \in \mathbb{Z}$ and $p_{ij} = O(\beta^{-1}\sqrt{n'})$.

Theorem 9.3 immediately implies the following result, which can be seen as a continuous
analogue of Theorem 8.1. This result was first proved by Tao and the second author for
the purpose of verifying the Circular Law in random matrix theory [58, 60] using a more
complicated argument.

Let n be a positive integer and $\beta, \rho$ be positive numbers that may depend on n. Let $\mathcal{S}_{n,\beta,\rho}$
be the collection of all multisets $A = \{a_1, \dots, a_n\}$, $a_i \in \mathbb{R}^d$, such that $\sum_{i=1}^n \|a_i\|_2^2 = 1$ and

$$\rho_{d,\beta,Ber}(A) \ge \rho.$$

Theorem 9.4 (Counting theorem, continuous case). [58, 60] Let $0 < \epsilon \le 1/3$ and $C > 0$
be constants. Then, for all sufficiently large n and $\beta \ge \exp(-n^{\epsilon})$ and $\rho \ge n^{-C}$ there is a
set $S \subset (\mathbb{R}^d)^n$ of size at most

$$\rho^{-n} n^{-n(\frac{1}{2} - \epsilon)} + \exp(o(n))$$

such that for any $A = \{a_1, \dots, a_n\} \in \mathcal{S}_{n,\beta,\rho}$ there is some $A' = (a'_1, \dots, a'_n) \in S$ such that
$\|a_i - a'_i\|_2 \le \beta$ for all i.



Proof (of Theorem 9.4). Set $n' := n^{1 - 3\epsilon/2}$ (which is $\gg n^{\epsilon}$ as $\epsilon \le 1/3$). Let S' be the collection
of all subsets of size at least $n - n'$ of GAPs whose parameters satisfy the conclusion of
Theorem 9.3.

Since each GAP is determined by its generators and dimensions, the number of such GAPs
is bounded by $((\beta^{-1}\sqrt{n'})\sqrt{n'})^{O(1)} (\frac{\rho^{-1}}{\sqrt{n'}})^{O(1)} = \exp(o(n))$. (The term $(\frac{\rho^{-1}}{\sqrt{n'}})^{O(1)}$ bounds the
number of choices of the dimensions $L_i$.) Thus

$$|S'| = \Big(O\Big(\Big(\frac{\rho^{-1}}{\sqrt{n'}}\Big)^n\Big) + 1\Big) \exp(o(n)).$$

We approximate each of the exceptional elements by a lattice point in $\beta \cdot (\mathbb{Z}/d)^d$. Thus if we
let S'' be the set of these approximated tuples then $|S''| \le \sum_{i \le n'} (O(\beta^{-1}))^i = \exp(o(n))$
(here we used the assumption $\beta \ge \exp(-n^{\epsilon})$).

Set $S := S' \times S''$. It is easy to see that $|S| \le O(n^{-1/2+\epsilon}\rho^{-1})^n + \exp(o(n))$. Furthermore, if
$\rho(A) \ge n^{-O(1)}$ then A is $\beta$-close to an element of S, concluding the proof. □



10. Inverse theorems: Continuous case II. 

Another realization of the Inverse Principle was given by Rudelson and Vershynin in
[45, 46] (see also Friedland and Sodin [16]). Let $a_1, \dots, a_n$ be real numbers. Rudelson and
Vershynin defined the essential least common denominator (LCD) of $a = (a_1, \dots, a_n)$ as
follows. Fix parameters $\alpha$ and $\gamma$, where $\gamma \in (0, 1)$, and define

$$\mathrm{LCD}_{\alpha,\gamma}(a) := \inf\{\theta > 0 : \mathrm{dist}(\theta a, \mathbb{Z}^n) < \min(\gamma\|\theta a\|_2, \alpha)\}.$$






The requirement that the distance is smaller than $\gamma\|\theta a\|_2$ forces us to consider only non-trivial
integer points as approximations of $\theta a$. One typically assumes $\gamma$ to be a small constant,
and $\alpha = c\sqrt{n}$ with a small constant $c > 0$. The inequality $\mathrm{dist}(\theta a, \mathbb{Z}^n) < \alpha$ then yields that
most coordinates of $\theta a$ are within a small distance from non-zero integers.
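The definition can be explored numerically on small examples. The sketch below approximates the infimum by a crude grid search over $\theta$; this is an illustrative assumption on our part, not an algorithm from [45, 46], and the names `dist_to_integers` and `lcd_grid` as well as the parameter values are hypothetical.

```python
from math import sqrt


def dist_to_integers(v):
    """Euclidean distance from the vector v to the integer lattice Z^n."""
    return sqrt(sum((x - round(x)) ** 2 for x in v))


def lcd_grid(a, alpha, gamma, theta_max, step=1e-3):
    """Smallest grid point theta with dist(theta*a, Z^n) < min(gamma*|theta*a|, alpha).

    This is a crude grid approximation of the infimum in the definition,
    not an exact algorithm; it returns None if no grid point qualifies.
    """
    norm_a = sqrt(sum(x * x for x in a))
    k = 1
    while k * step <= theta_max:
        theta = k * step
        scaled = [theta * x for x in a]
        if dist_to_integers(scaled) < min(gamma * theta * norm_a, alpha):
            return theta
        k += 1
    return None


n = 16
a = [1 / sqrt(n)] * n              # unit vector with equal coordinates
gamma, alpha = 0.1, 0.3 * sqrt(n)  # gamma small, alpha = c*sqrt(n) with c = 0.3
approx = lcd_grid(a, alpha, gamma, theta_max=2 * sqrt(n))
print(approx)
```

For this particular vector a direct computation gives the infimum $\sqrt{n}/(1+\gamma)$ (about 3.636 for the values above), so the LCD is of order $\sqrt{n}$, and the grid search should return a value within one grid step of it.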

Theorem 10.1 (Diophantine approximation). [45, 46] Consider a sequence $A = \{a_1, \dots, a_n\}$
of real numbers which satisfies $\sum_{i=1}^n a_i^2 \ge 1$. Let $\xi$ be a random variable such that
$\sup_a P(\xi \in B(a, 1)) \le 1 - b$ for some $b > 0$, and $x_1, \dots, x_n$ be iid copies of $\xi$. Then, for every $\alpha > 0$
and $\gamma \in (0, 1)$, and for

$$\beta \ge \frac{1}{\mathrm{LCD}_{\alpha,\gamma}(a)},$$

we have

$$\rho_{1,\beta,\xi}(A) \le \frac{C\beta}{\gamma\sqrt{b}} + Ce^{-2b\alpha^2}.$$



One can use Theorem 10.1 to prove a special case of the forward result of Erdős and
Littlewood-Offord when most of the $a_i$ have the same order of magnitude (see [45, p. 6]).
Indeed, assume that $K_1 \le |a_i| \le K_2$ for all i, where $K_2 = cK_1$ with $c = O(1)$. Set
$a'_i := a_i / \sqrt{\sum_j a_j^2}$ and $a' := (a'_1, \dots, a'_n)$. If we choose $\gamma = c_1$, $\alpha = c_2\sqrt{n}$ with sufficiently small
positive constants $c_1, c_2$ (depending on c), the condition $\mathrm{dist}(\theta a', \mathbb{Z}^n) < \min(\gamma\|\theta a'\|_2, \alpha)$
implies that $|\theta a'_i - n_i| \le 1/3$ with $n_i \in \mathbb{Z}$, $n_i \ne 0$ for at least $c_3 n$ indices i, where $c_3$ is a
positive constant depending on $c_1, c_2$. It then follows that for these indices, $\theta^2 a_i'^2 \ge 4n_i^2/9$.
Summing over i, we obtain $\theta^2 = \Omega(n)$ and so $\mathrm{LCD}_{\alpha,\gamma}(a') = \Omega(\sqrt{n})$. Applying Theorem
10.1 to the vector a' with $\beta = 1/\mathrm{LCD}_{\alpha,\gamma}(a')$, we obtain the desired upper bound $O(1/\sqrt{n})$
for the concentration probability.



Theorem 10.1 is not exactly inverse in the Freiman sense. On the other hand, it is convenient
to use and in most applications provides a sufficient amount of structural information
that allows one to derive a counting theorem. An extra advantage here is that this theorem
enables one to consider sets A with small ball probability as small as $(1 - \epsilon)^n$, rather than
just $n^{-C}$ as in Theorem 9.2.



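The definition of the essential least common denominator above can be extended naturally
to higher dimensions. To this end, we define the product of a multi-vector $a = (a_1, \dots, a_n)$, $a_i \in \mathbb{R}^d$, and a vector
$\theta \in \mathbb{R}^d$ as

$$\theta \cdot a = (\langle \theta, a_1 \rangle, \dots, \langle \theta, a_n \rangle) \in \mathbb{R}^n.$$

Then we define, for $\alpha > 0$ and $\gamma \in (0, 1)$,

$$\mathrm{LCD}_{\alpha,\gamma}(a) := \inf\{\|\theta\|_2 : \theta \in \mathbb{R}^d, \mathrm{dist}(\theta \cdot a, \mathbb{Z}^n) < \min(\gamma\|\theta \cdot a\|_2, \alpha)\}.$$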



The following generalization of Theorem 10.1 gives a bound on the small ball probability
for the random sum $\sum_{i=1}^n a_i x_i$ in terms of the additive structure of the coefficient sequence
a.



One can also handle this case by conditioning on the abnormal $a_i$ and using Berry-Esseen for the remaining
Theorem 10.2 (Diophantine approximation, multi-dimensional case). [46, 16] Consider a
sequence $A = \{a_1, \dots, a_n\}$ of vectors $a_i \in \mathbb{R}^d$, which satisfies

$$\sum_{i=1}^n \langle a_i, x \rangle^2 \ge \|x\|_2^2 \quad \text{for every } x \in \mathbb{R}^d. \tag{14}$$

Let $\xi$ be a random variable such that $\sup_a P(\xi \in B(a, 1)) \le 1 - b$ for some $b > 0$ and
$x_1, \dots, x_n$ be iid copies of $\xi$.

Then, for every $\alpha > 0$ and $\gamma \in (0, 1)$, and for

$$\beta \ge \frac{\sqrt{d}}{\mathrm{LCD}_{\alpha,\gamma}(a)},$$

we have



We will sketch the proof of Theorem 10.1 in Appendix B.



11. Inverse quadratic Littlewood-Offord 

In this section, we revisit the quadratic Littlewood-Offord bound in Section 4 and consider
its inverse. We first consider a few examples of A where the quadratic small ball probability
$\rho_q(A)$ is large.

Example 11.1 (Additive structure implies large small ball probability). Let Q be a proper
symmetric GAP of rank $r = O(1)$ and of size $n^{O(1)}$. Assume that $a_{ij} \in Q$; then for any
$\xi_i \in \{\pm 1\}$,

$$\sum_{i,j} a_{ij}\xi_i\xi_j \in n^2 Q.$$

Thus, by the pigeonhole principle,

$$\rho_q(A) \ge n^{-2r}|Q|^{-1} = n^{-O(1)}.$$

But unlike the linear case, additive structure is not the only source for large small ball 
probability. Our next example shows that algebra also plays a role. 

Example 11.2 (Algebraic structure implies large small ball probability). Assume that

$$a_{ij} = k_i b_j + k_j b_i,$$

where $k_i \in \mathbb{Z}$, $|k_i| = n^{O(1)}$ and such that $P(\sum_i k_i\xi_i = 0) = n^{-O(1)}$.
Then we have

$$P\Big(\sum_{i,j} a_{ij}\xi_i\xi_j = 0\Big) = P\Big(\sum_i k_i\xi_i \sum_j b_j\xi_j = 0\Big) = n^{-O(1)}.$$
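The mechanism in this example can be checked by brute force for small n: when $a_{ij} = k_i b_j + k_j b_i$, the quadratic form equals $2(\sum_i k_i\xi_i)(\sum_j b_j\xi_j)$, so it vanishes whenever the linear form in the $k_i$ does. The following sketch enumerates all sign patterns; the particular vectors `k` and `b` are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def prob_zero(f, n):
    """Exact P(f(xi) = 0) for xi uniform on {-1, +1}^n, by full enumeration."""
    hits = sum(1 for xi in product((-1, 1), repeat=n) if f(xi) == 0)
    return Fraction(hits, 2 ** n)


n = 10
k = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]   # hypothetical integers k_i
b = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]        # arbitrary coefficients b_j
a = [[k[i] * b[j] + k[j] * b[i] for j in range(n)] for i in range(n)]


def linear(xi):
    return sum(k[i] * xi[i] for i in range(n))


def quadratic(xi):
    return sum(a[i][j] * xi[i] * xi[j] for i in range(n) for j in range(n))


p_lin = prob_zero(linear, n)
p_quad = prob_zero(quadratic, n)
print(p_lin, p_quad)
```

Here `p_quad` is at least `p_lin`, although the coefficients $b_j$ carry no additive structure at all; the concentration comes entirely from the low-rank (algebraic) form of the matrix.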

Combining the above two examples, we have the following general one.

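Example 11.3 (Structure implies large small ball probability). Assume that $a_{ij} = a'_{ij} + a''_{ij}$,
where $a'_{ij} \in Q$, a proper symmetric GAP of rank $O(1)$ and size $n^{O(1)}$, and

$$a''_{ij} = k_{i1}b_{1j} + k_{j1}b_{1i} + \dots + k_{ir}b_{rj} + k_{jr}b_{ri},$$

where $b_{1i}, \dots, b_{ri}$ are arbitrary and $k_{i1}, \dots, k_{ir}$ are integers bounded by $n^{O(1)}$, and $r = O(1)$
such that

$$P\Big(\sum_i k_{i1}\xi_i = 0, \dots, \sum_i k_{ir}\xi_i = 0\Big) = n^{-O(1)}.$$

Then we have

$$P\Big(\sum_{i,j} a''_{ij}\xi_i\xi_j = 0\Big) = n^{-O(1)}.$$

Thus,

$$P\Big(\sum_{i,j} a_{ij}\xi_i\xi_j \in n^2 Q\Big) = n^{-O(1)}.$$

It then follows, by the pigeonhole principle, that $\rho_q(A) = n^{-O(1)}$.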



We have demonstrated the fact that as long as most of the $a_{ij}$ can be decomposed as
$a_{ij} = a'_{ij} + a''_{ij}$, where $a'_{ij}$ belongs to a GAP of rank $O(1)$ and size $n^{O(1)}$ and the symmetric
matrix $(a''_{ij})$ has rank $O(1)$, then $A = (a_{ij})$ has large quadratic small ball probability. The
first author in [36] showed that a sort of converse is also true.

Theorem 11.4 (Inverse Littlewood-Offord theorem for quadratic forms). Let $\epsilon < 1$ and C be
positive constants. Assume that

$$\rho_q(A) \ge n^{-C}.$$

Then there exist index sets $I_0, I$ of size $O_{C,\epsilon}(1)$ and $n - O_C(n^{\epsilon})$ respectively, with $I \cap I_0 = \emptyset$,
and there exist integers $k_{ii_0}$ (for any pair $i_0 \in I_0$, $i \in I$) of size bounded by $n^{O_{C,\epsilon}(1)}$, and a
structured set Q of the form

$$Q = \Big\{ \sum_{h=1}^{O_{C,\epsilon}(1)} \frac{p_h}{q_h} g_h \ \Big|\ p_h, q_h \in \mathbb{Z},\ |p_h|, |q_h| = n^{O_{C,\epsilon}(1)} \Big\},$$

such that for all $i \in I$ the following holds:

• (low rank decomposition) for any $j \in I$,

$$a_{ij} = a'_{ij} - \Big(\sum_{i_0 \in I_0} k_{ii_0} a_{i_0 j} + \sum_{i_0 \in I_0} k_{ji_0} a_{i_0 i}\Big);$$

• (common additive structure of small size) all but $O_C(n^{\epsilon})$ entries $a'_{ij}$ belong to Q.

We remark that the common structure Q is not yet a GAP, as the coefficients are rational,
instead of being integers. It is desirable to have an analogue of Theorem 7.6 with common
structure as a genuine GAP with optimal parameters (see for instance [7, Conjecture 1]
for a precise conjecture for bilinear forms). For counting purposes, this inverse theorem is
sufficiently strong.

12. Application: The least singular value of a random matrix 

For a matrix A, let $\sigma_n(A)$ denote its smallest singular value. It is well known that $\sigma_n(A) \ge 0$
and the bound is strict if and only if A is non-singular. An important problem with many
practical applications is to bound the least singular value of a non-singular matrix (see
[T71 [52l [53l [63t Wl\ [9] for discussions) . The problem of estimating the least singular value 
of a random matrix was first raised by Goldstine and von Neumann [TTj in the 1940s, with 
connection to their investigation of the complexity of inverting a matrix. 

To answer Goldstine and von Neumann's question, Edelman [9] computed the distribution
of the least singular value (LSV) of the random matrix $M_n^{Gau}$ of size n with iid standard gaussian entries, and
showed that for all fixed $t > 0$

$$P(\sigma_n(M_n^{Gau}) \le t n^{-1/2}) = \int_0^{t^2} \frac{1+\sqrt{x}}{2\sqrt{x}} e^{-(x/2+\sqrt{x})}\,dx + o(1) = t - \frac{1}{3}t^3 + O(t^4) + o(1).$$

He conjectured that this distribution is universal (i.e., it must hold for other models of
random matrices, such as Bernoulli).
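The small-t expansion of the limiting law can be verified numerically. The sketch below assumes the limit is read as $\int_0^{t^2} \frac{1+\sqrt{x}}{2\sqrt{x}} e^{-(x/2+\sqrt{x})}\,dx$, which integrates in closed form to $1 - e^{-t^2/2 - t}$; the substitution $x = u^2$ removes the singularity at 0. The function names are our own.

```python
from math import exp


def edelman_limit(t, steps=100_000):
    """Midpoint-rule value of the limiting integral after substituting x = u^2.

    The integrand (1 + sqrt(x)) / (2 sqrt(x)) * exp(-(x/2 + sqrt(x))) dx on
    [0, t^2] becomes (1 + u) * exp(-(u^2/2 + u)) du on [0, t].
    """
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (1 + u) * exp(-(u * u / 2 + u))
    return total * h


def closed_form(t):
    """Antiderivative evaluated exactly: 1 - exp(-t^2/2 - t)."""
    return 1 - exp(-t * t / 2 - t)


for t in (0.05, 0.1, 0.2):
    print(t, edelman_limit(t), closed_form(t), t - t ** 3 / 3)
```

The three printed columns should agree up to an error of order $t^4$, which is what the expansion $t - \frac{1}{3}t^3 + O(t^4)$ asserts.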

More recently, in their study of smoothed analysis of the simplex method, Spielman and
Teng [52, 53] showed that for any $t > 0$ (t can go to 0 with n)

$$P(\sigma_n(M_n^{Gau}) \le t n^{-1/2}) \le t. \tag{15}$$

They conjectured that a slightly adjusted bound holds in the Bernoulli case [52]

$$P(\sigma_n(M_n^{Ber}) \le t) \le t n^{1/2} + c^n, \tag{16}$$

where $0 < c < 1$ is a constant. The term $c^n$ is needed as $M_n^{Ber}$ can be singular with
exponentially small probability.

Edelman's conjecture has been proved by Tao and the second author in [64]. This work
also confirms Spielman and Teng's conjecture for the case when t is fairly large: $t \ge n^{-\delta}$ for some
small constant $\delta > 0$. For $t \ge n^{-3/2}$, Rudelson in [33], making use of Halász' machinery
from [21], obtained a strong bound with an extra (multiplicative) constant factor. In many
applications, it is important to be able to treat even smaller t. As a matter of fact, in
applications what one usually needs is for the probability bound to be very small, and this
automatically requires one to set t very small.

In the last few years, thanks to the development of inverse theorems, one can now prove
very strong bounds for almost the entire range of t.

Consider a matrix M with row vectors $X_i$ and singular values $\sigma_1 \ge \dots \ge \sigma_n$. Let $d_i$ be
the distance from $X_i$ to the hyperplane formed by the other $n - 1$ rows. There are several
ways to exhibit a direct relation between the $d_i$ and $\sigma_i$. For instance, Tao and the second
author showed [58]

$$d_1^{-2} + \dots + d_n^{-2} = \sigma_1^{-2} + \dots + \sigma_n^{-2}. \tag{17}$$

A more technical relation, which is more effective in certain applications, is [45, Lemma 3.5].

From this, it is clear that if one can bound the $d_i$ from below with high probability, then
one can do the same for $\sigma_n$. Let $v = (a_1, \dots, a_n)$ be the unit normal vector of the hyperplane
formed by $X_2, \dots, X_n$ and $\xi_1, \dots, \xi_n$ be the coordinates of $X_1$; then

$$d_1 = |a_1\xi_1 + \dots + a_n\xi_n|.$$

Thus, the probability that $d_1$ is small is exactly the small ball probability for the multi-set
$A = \{a_1, \dots, a_n\}$. If this probability is large, then the inverse theorems tell us that the set
A must have strong additive structure. However, A comes as the normal vector of a random
hyperplane, so the probability that it has any special structure is very small (to quantify
this we can use counting theorems such as Theorem 9.4). Thus, we obtain, with high
probability, a lower bound on all $d_i$. In principle, one can use this to deduce a lower bound
for $\sigma_n$.
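The relation between the distances $d_i$ and the singular values can be checked exactly on a small example. The sketch below uses two facts: $\sum_i \sigma_i^{-2} = \mathrm{trace}((MM^T)^{-1})$, and column i of $M^{-1}$ is normal to every row except $X_i$ and has inner product 1 with $X_i$, so that $d_i = 1/\|\mathrm{col}_i(M^{-1})\|_2$. The distances themselves are computed independently from Gram determinants. All helper names and the test matrix are assumptions made for illustration.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by fraction Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in m]
    n, result = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return result


def inverse(m):
    """Exact inverse by Gauss-Jordan elimination (m must be non-singular)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def gram(rows):
    """Gram matrix of the given rows, i.e. M M^T when rows are the rows of M."""
    return [[sum(x * y for x, y in zip(u, v)) for v in rows] for u in rows]


def squared_distances(m):
    """d_i^2 = det Gram(all rows) / det Gram(rows other than i)."""
    full = det(gram(m))
    return [full / det(gram(m[:i] + m[i + 1:])) for i in range(len(m))]


m = [[1, -1, 1, 1],
     [1, 1, -1, 1],
     [-1, 1, 1, 1],
     [1, 1, 1, -1]]   # a small non-singular +-1 matrix chosen by hand

inv = inverse(m)
d2 = squared_distances(m)
# Sum of sigma_i^{-2} equals trace((M M^T)^{-1}).
sum_inv_sigma2 = sum(inverse(gram(m))[i][i] for i in range(len(m)))
# Squared norms of the columns of M^{-1}; each should equal 1 / d_i^2.
col_norms2 = [sum(inv[r][i] ** 2 for r in range(len(m))) for i in range(len(m))]
print(sum(1 / d for d in d2), sum_inv_sigma2)
```

Because all arithmetic is done with `Fraction`, the two printed numbers are equal exactly, not just up to rounding.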

Carrying out the above plan requires certain extra ideas and some careful analysis. In [60],
Tao and the second author managed to prove

Theorem 12.1. For any constant $A > 0$, there is a constant $B > 0$ such that

$$P(\sigma_n(M_n^{Ber}) \le n^{-B}) \le n^{-A}.$$

The first inverse theorem, Theorem 7.5, was first proved in this paper, as a step in the proof
of Theorem 12.1. In a subsequent paper, Rudelson and Vershynin developed Theorem 10.1
and used it, in combination with [45, Lemma 3.5] and many other ideas, to show

Theorem 12.2. There is a constant $C > 0$ and $0 < c < 1$ such that for any $t > 0$,

$$P(\sigma_n(M_n^{Ber}) \le t) \le C t n^{1/2} + c^n.$$

This bound is sharp, up to the constant C. It also gives a new proof of the Kahn-Komlós-Szemerédi
bound on the singularity probability of a random Bernoulli matrix (see Section
13). Both theorems hold in a more general setting.



In practice, one often works with random matrices of the type $A + M_n$ where A is deterministic
and $M_n$ has iid entries. (For instance, in their works on smoothed analysis, Spielman
and Teng used this to model a large data matrix perturbed by random noise.) They proved
in [52]



Theorem 12.3. Let A be an arbitrary n by n matrix. Then for any t > 0, 



One may ask whether there is an analogue of Theorem 12.2 for this model. The answer
is, somewhat surprisingly, negative. An analogue of the weaker Theorem 12.1 is, however,
available, assuming that $\|A\|$ is bounded polynomially in n. For more discussion on this
model, we refer to |63j . For applications in Random Matrix Theory (such as the establish- 
ment of the Circular Law) and many related results, we refer to [591 ESI ISHl [181 HOI [2l HZ] 
and the references therein. 



13. Application: Strong bounds on the singularity problem - the non-symmetric case

We continue to discuss the singularity problem from Section 5. The first exponential bound
on $p_n$ was proved by Kahn, Komlós and Szemerédi [23], who showed that $p_n \le .999^n$. In
[55], Tao and the second author simplified the proof and got a slightly improved bound
$.952^n$. A more notable improvement which pushed the bound to $(3/4 + o(1))^n$ was obtained
in a subsequent paper [57], which combined the approach of Kahn et al. with an inverse theorem.
The best current bound is $(1/\sqrt{2} + o(1))^n$ by Bourgain, Vu and Wood [3]. The proof of this
bound still relies heavily on the approach from [57] (in particular it uses the same inverse
theorem), but adds a new twist which makes the first part of the argument more effective.

In the following, we present the approach from [23] and [57]. Similar to the proof
in Appendix A, we first embed the problem in a finite field $F = F_p$, where p is a very large
prime. Let $\{-1, 1\}^n \subset F^n$ be the discrete unit cube in $F^n$. We let X be the random variable
taking values in $\{-1, 1\}^n$ which is distributed uniformly on this cube (thus each element
of $\{-1, 1\}^n$ is attained with probability $2^{-n}$). Let $X_1, \dots, X_n \in \{-1, 1\}^n$ be n independent
samples of X. Then

$$p_n := P(X_1, \dots, X_n \text{ linearly dependent}).$$

For each linear subspace V of $F^n$, let $A_V$ denote the event that $X_1, \dots, X_n$ span V. Let us
call a space V non-trivial if it is spanned by the set $V \cap \{-1, 1\}^n$. Note that $P(A_V) \ne 0$
if and only if V is non-trivial. Since every collection of n linearly dependent vectors in $F^n$
will span exactly one proper subspace V of $F^n$, we have

$$p_n = \sum_{V \text{ a proper non-trivial subspace of } F^n} P(A_V). \tag{18}$$

It is not hard to show that the dominant contribution to this sum comes from the hyperplanes:

$$p_n = 2^{o(n)} \sum_{V \text{ a non-trivial hyperplane in } F^n} P(A_V).$$

Thus, if one wants to show $p_n \le (3/4 + o(1))^n$, it suffices to show

$$\sum_{V \text{ a non-trivial hyperplane in } F^n} P(A_V) \le (3/4 + o(1))^n.$$

The next step is to partition the non-trivial hyperplanes V into a number of classes, depending
on the number of $\{-1, 1\}$ vectors in V.

Definition 13.1 (Combinatorial dimension). Let $D := \{d_\pm \in \mathbb{Z}/n : 1 \le d_\pm \le n\}$. For any
$d_\pm \in D$, we define the combinatorial Grassmannian $Gr(d_\pm)$ to be the set of all non-trivial
hyperplanes V in $F^n$ with

$$2^{d_\pm - 1/n} < |V \cap \{-1, 1\}^n| \le 2^{d_\pm}. \tag{19}$$

We will refer to $d_\pm$ as the combinatorial dimension of V.
It thus suffices to show that

$$\sum_{d_\pm \in D} \sum_{V \in Gr(d_\pm)} P(A_V) \le \Big(\frac{3}{4} + o(1)\Big)^n. \tag{20}$$

It is therefore of interest to understand the size of the combinatorial Grassmannians $Gr(d_\pm)$
and the probability of the events $A_V$ for hyperplanes V in those Grassmannians.

There are two easy cases, one when $d_\pm$ is fairly small and one where $d_\pm$ is fairly large.

Lemma 13.2 (Small combinatorial dimension estimate). Let $0 < \alpha < 1$ be arbitrary. Then

$$\sum_{d_\pm \in D : 2^{d_\pm - n} \le \alpha^n} \ \sum_{V \in Gr(d_\pm)} P(A_V) \le n\alpha^n.$$

Proof (of Lemma 13.2). Observe that if $X_1, \dots, X_n$ span V, then there are $n - 1$ vectors
among the $X_i$ which already span V. By symmetry, we thus have

$$P(A_V) = P(X_1, \dots, X_n \text{ span } V) \le n P(X_1, \dots, X_{n-1} \text{ span } V) P(X \in V). \tag{21}$$

On the other hand, if $V \in Gr(d_\pm)$ and $2^{d_\pm - n} \le \alpha^n$, then $P(X \in V) \le \alpha^n$ thanks to (19).
Thus we have

$$P(A_V) \le n\alpha^n P(X_1, \dots, X_{n-1} \text{ span } V).$$

Since $X_1, \dots, X_{n-1}$ can span at most one space V, the claim follows. □
Lemma 13.3 (Large combinatorial dimension estimate). We have

$$\sum_{d_\pm \in D : 2^{d_\pm - n} \ge 100/\sqrt{n}} \ \sum_{V \in Gr(d_\pm)} P(A_V) \le (1 + o(1)) n^2 2^{-n}.$$

This proof uses Theorem 3.1 and is left as an exercise; consult [23, 57] for details. The heart
of the matter is the following, somewhat more difficult, result.

Proposition 13.4 (Medium combinatorial dimension estimate). Let $0 < \epsilon_0 \ll 1$, and let
$d_\pm \in D$ be such that $(\frac{3}{4} + 2\epsilon_0)^n < 2^{d_\pm - n} < \frac{100}{\sqrt{n}}$. Then we have

$$\sum_{V \in Gr(d_\pm)} P(A_V) \le o(1)^n,$$

where the rate of decay in the o(1) quantity depends on $\epsilon_0$ (but not on $d_\pm$).

Note that D has cardinality $|D| = O(n^2)$. Thus if we combine this proposition with Lemma
13.2 (with $\alpha := \frac{3}{4} + 2\epsilon_0$) and Lemma 13.3, we see that we can bound the left-hand side of
(20) by

$$n\Big(\frac{3}{4} + 2\epsilon_0\Big)^n + n^2 o(1)^n + (1 + o(1)) n^2 2^{-n} = \Big(\frac{3}{4} + 2\epsilon_0 + o(1)\Big)^n.$$

Since $\epsilon_0$ is arbitrary, the upper bound $(3/4 + o(1))^n$ follows.



We now informally discuss the proof of Proposition 13.4. We start with the trivial bound

$$\sum_{V \in Gr(d_\pm)} P(A_V) \le 1 \tag{22}$$

that arises simply because any vectors $X_1, \dots, X_n$ can span at most one space V. To improve
upon this trivial bound, the key innovation in [23] is to replace X by another random variable
Y which tends to be more concentrated on subspaces V than X is. Roughly speaking, one
seeks the property

$$P(X \in V) \le c P(Y \in V) \tag{23}$$

for some absolute constant $0 < c < 1$ and for all (or almost all) subspaces $V \in Gr(d_\pm)$.
From this property, one expects (heuristically, at least)

$$P(A_V) = P(X_1, \dots, X_n \text{ span } V) \le c^n P(Y_1, \dots, Y_n \text{ span } V), \tag{24}$$

where $Y_1, \dots, Y_n$ are iid samples of Y, and then by applying the trivial bound (22) with Y
instead of X, we would then obtain a bound of the form $\sum_{V \in Gr(d_\pm)} P(A_V) \le c^n$, at least
in principle. Clearly, it will be desirable to make c as small as possible; if we can make c
arbitrarily small, we will have established Proposition 13.4.



The random variable Y can be described as follows. Let $0 \le \mu \le 1$ be a small absolute
constant (a specific small value of $\mu$ was chosen in [23]), and let $\eta^{(\mu)}$ be a random variable
taking values in $\{-1, 0, 1\} \subset F$ which equals 0 with probability $1 - \mu$ and equals +1 or −1
with probability $\mu/2$ each. Then let $Y := (\eta_1^{(\mu)}, \dots, \eta_n^{(\mu)}) \in F^n$, where $\eta_1^{(\mu)}, \dots, \eta_n^{(\mu)}$ are iid
samples of $\eta^{(\mu)}$. By using a Fourier-analytic argument of Halász [21], a bound of the form

$$P(X \in V) \le C\sqrt{\mu}\, P(Y \in V)$$






was shown in [23], where C was an absolute constant (independent of n), and V was a
hyperplane which was non-degenerate in the sense that its combinatorial dimension was
not too close to n. For $\mu$ sufficiently small, one then obtains (23) for some $0 < c < 1$,
although one cannot make c arbitrarily small without shrinking $\mu$ also.

There are however some technical difficulties with this approach, arising when one tries to
pass from (23) to (24). The first problem is that the random variable Y, when conditioned
on the event $Y \in V$, may concentrate on a lower dimensional subspace of V, making it
unlikely that $Y_1, \dots, Y_n$ will span V. In particular, Y has a probability of $(1 - \mu)^n$ of being
the zero vector, which basically means that one cannot hope to exploit (23) in any non-trivial
way once $P(X \in V) \le (1 - \mu)^n$. However, in this case V has very low combinatorial
dimension and Lemma 13.2 already gives an exponential gain.

Even when $(1 - \mu)^n < P(X \in V) \le 1$, it turns out that it is still not particularly easy to
obtain (24), but one can obtain an acceptable substitute for this estimate by only replacing
some of the $X_j$ by $Y_j$. Specifically, one can try to obtain an estimate roughly of the form

$$P(X_1, \dots, X_n \text{ span } V) \le c^m P(Y_1, \dots, Y_m, X_1, \dots, X_{n-m} \text{ span } V) \tag{25}$$

where m is equal to a suitably small multiple of n (we will eventually take $m \approx n/100$).
Strictly speaking, we will also have to absorb an additional "entropy" loss of $\binom{n}{m}$ for technical
reasons, though as we will be taking c arbitrarily small, this loss will ultimately be irrelevant.



The above approach (with some minor modifications) was carried out rigorously in [23]
to give the bound $p_n = O(.999^n)$, which has been improved slightly to $O(.952^n)$ in [56],
thanks to some simplifications. There are two main reasons why the final gain in the base
was relatively small. Firstly, the chosen value of $\mu$ was small (so the $n(1 - \mu)^n$ error was
sizeable), and secondly the value of c obtained was relatively large (so the gain of $c^n$ or
$c^{(1-\gamma)n}$ was relatively weak). Unfortunately, increasing $\mu$ also causes c to increase, and so
even after optimizing $\mu$ and c one falls well short of the conjectured bound.

The more significant improvement to $(3/4 + o(1))^n$ relies on an inverse theorem. To reduce
all the other losses to $(\frac{3}{4} + 2\epsilon_0)^n$ for some small $\epsilon_0$, we increase $\mu$ up to $1/4 - \epsilon_0/100$, at
which point the arguments of Halász and [23, 56] give (23) with $c = 1$. The value 1/4 for $\mu$
is optimal as it is the largest number satisfying the pointwise inequality

$$|\cos(x)| \le (1 - \mu) + \mu\cos(2x) \quad \text{for all } x \in \mathbb{R},$$

which is the Fourier-analytic analogue of (23) (with $c = 1$). At first glance, the fact that
$c = 1$ seems to remove any utility to (23), as the above argument relied on obtaining gains
of the form $c^n$ or $c^{(1-\gamma)n}$. However, we can proceed further by subdividing the collection of
hyperplanes $Gr(d_\pm)$ into two classes, namely the unexceptional spaces V for which

$$P(X \in V) < \epsilon_1 P(Y \in V)$$

for some small constant $0 < \epsilon_1 \ll 1$ to be chosen later (it will be much smaller than $\epsilon_0$),
and the exceptional spaces for which

$$\epsilon_1 P(Y \in V) \le P(X \in V) \le P(Y \in V). \tag{26}$$

The contribution of the unexceptional spaces can be dealt with by the preceding arguments
to obtain a very small contribution (at most $\delta^n$ for any fixed $\delta > 0$ given that we set
$\epsilon_1 = \epsilon_1(\gamma, \delta)$ suitably small), so it remains to consider the exceptional spaces V.
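The optimality of $\mu = 1/4$ in the pointwise inequality $|\cos x| \le (1 - \mu) + \mu\cos(2x)$ is easy to test numerically. Writing $c = |\cos x|$ and using $\cos(2x) = 2c^2 - 1$, the difference of the two sides factors as $(1 - c)(1 - 2\mu - 2\mu c)$, which is non-negative for all $c \in [0, 1]$ exactly when $\mu \le 1/4$. The grid check below is a sketch for illustration; the function name and grid size are our own choices.

```python
from math import cos, pi


def inequality_holds(mu, points=10_001, tol=1e-12):
    """Check |cos x| <= (1 - mu) + mu * cos(2x) on a grid of [0, pi].

    Both sides have period pi, so one period is enough.
    """
    for i in range(points):
        x = pi * i / (points - 1)
        if abs(cos(x)) > (1 - mu) + mu * cos(2 * x) + tol:
            return False
    return True


print(inequality_holds(0.25))   # mu = 1/4
print(inequality_holds(0.26))   # slightly larger mu
```

The inequality holds on the grid for $\mu = 1/4$ and fails for $\mu = 0.26$, with the violation occurring for x close to a multiple of $\pi$, where $|\cos x|$ is close to 1.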
The key technical step is to show that there are very few exceptional hyperplanes (and
thus their contribution is negligible). This can be done using the following inverse theorem
(the way the counting Theorem 8.1 was proved using the inverse Theorem 7.6).

Let $V \in Gr(d_\pm)$ be an exceptional space, with a representation of the form

$$V = \{(x_1, \dots, x_n) \in F^n : x_1 a_1 + \dots + x_n a_n = 0\} \tag{27}$$

for some elements $a_1, \dots, a_n \in F$. We shall refer to $a_1, \dots, a_n$ as the defining co-ordinates
for V.
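For a hyperplane given by its defining co-ordinates, the quantity $P(X \in V)$ is just the probability that the signed sum $a_1 x_1 + \dots + a_n x_n$ vanishes, so it can be computed exactly by dynamic programming. The sketch below works with integer coefficients rather than elements of $F_p$, which is a simplifying assumption, and the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def prob_in_hyperplane(a):
    """Exact P(a_1 x_1 + ... + a_n x_n = 0) for x uniform on {-1, +1}^n.

    Integer coefficients are used here for simplicity, as an assumption;
    in the text the a_i live in a large finite field F_p.
    """
    dist = {0: Fraction(1)}
    for coeff in a:
        nxt = defaultdict(Fraction)
        for value, prob in dist.items():
            nxt[value + coeff] += prob / 2
            nxt[value - coeff] += prob / 2
        dist = dict(nxt)
    return dist.get(0, Fraction(0))


n = 8
print(prob_in_hyperplane([1, -1] + [0] * (n - 2)))       # x_1 = x_2
print(prob_in_hyperplane([1, 1, 1, 1] + [0] * (n - 4)))  # x_1 + x_2 + x_3 + x_4 = 0
print(prob_in_hyperplane([1] * n))                       # x_1 + ... + x_n = 0
```

Since $P(X \in V) = |V \cap \{-1, 1\}^n| / 2^n$, these probabilities determine the combinatorial dimension of V up to the rounding in (19); hyperplanes with few non-zero defining co-ordinates have the largest values.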

Theorem 13.5. There is a constant $C = C(\epsilon_0, \epsilon_1)$ such that the following holds. Let V be a
hyperplane in $Gr(d_\pm)$ and $a_1, \dots, a_n$ be its defining co-ordinates. Then there exist integers

$$1 \le r \le C \tag{28}$$

and $M_1, \dots, M_r \ge 1$ with the volume bound

$$M_1 \cdots M_r \le C 2^{n - d_\pm} \tag{29}$$

and non-zero elements $v_1, \dots, v_r \in F$ such that the following holds.

• (Defining coordinates lie in a progression) The symmetric generalized arithmetic
progression

$$P := \{m_1 v_1 + \dots + m_r v_r : -M_j/2 < m_j < M_j/2 \text{ for all } 1 \le j \le r\}$$

is proper and contains all the $a_j$.

• (Bounded norm) The $a_j$ have small P-norm:

$$\sum_{j=1}^n \|a_j\|_P^2 \le C. \tag{30}$$

• (Rational commensurability) The set $\{v_1, \dots, v_r\} \cup \{a_1, \dots, a_n\}$ is contained in the
set

$$\Big\{ \frac{p}{q} v_1 : p, q \in \mathbb{Z};\ q \ne 0;\ |p|, |q| \le n^{o(n)} \Big\}. \tag{31}$$

14. Application: Strong bounds on the singularity problem - the symmetric case

Similar to Conjecture 5.1, we raise

Conjecture 14.1.

$$p_n^{sym} = (1/2 + o(1))^n.$$

We are very far from this conjecture. Currently, no exponential upper bound is known. The
first superpolynomial bound was obtained by the first author [36] very recently.

Theorem 14.2. [36] For any $C > 0$ and n sufficiently large

$$p_n^{sym} \le n^{-C}.$$

Shortly after, Vershynin [69] proved the following better bound.

Theorem 14.3. There exists a positive constant c such that

$$p_n^{sym} = O(\exp(-n^c)).$$

Both proofs made essential use of inverse theorems. The first author used the inverse
quadratic Theorem 11.4 and Vershynin's proof used Theorem 10.1 several times.



In the following, we sketch the main ideas behind Theorem 14.2. Let $r = (\xi_1, \dots, \xi_n)$ be
the first row of $M_n$, and $a_{ij}$, $2 \le i, j \le n$, be the cofactors (taken with a suitable sign) of $M_{n-1}$, the matrix obtained by removing r
and $r^T$ from $M_n$. We have

$$\det(M_n) = \xi_1 \det(M_{n-1}) + \sum_{2 \le i,j \le n} a_{ij}\xi_i\xi_j. \tag{32}$$

Recall the proof of Theorem 5.6 (see Section 5). One first needs to show that with
high probability (with respect to $M_{n-1}$) a good fraction of the cofactors $a_{ij}$ are nonzero.
Theorem 4.1 then yields that

$$\Pr(\det(M_n) = 0) \le n^{-1/8+o(1)} = o(1).$$
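The expansion of the determinant along the first row and column can be confirmed on a small symmetric matrix. The sketch below uses the identity $\det\begin{pmatrix} \xi_1 & r'^T \\ r' & B \end{pmatrix} = \xi_1 \det B - r'^T \mathrm{adj}(B)\, r'$, so it takes $a_{ij}$ to be minus the $(i,j)$ cofactor of $M_{n-1}$; this sign convention, the helper names and the test matrix are assumptions made for illustration.

```python
from itertools import product


def det(m):
    """Integer determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cofactor(m, i, j):
    """(i, j) cofactor: signed determinant of m without row i and column j."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * (det(minor) if minor else 1)


def expansion(top, rest, m_small):
    """xi_1 * det(M_{n-1}) + sum a_ij xi_i xi_j with a_ij := -cofactor(M_{n-1}, i, j)."""
    k = len(m_small)
    quad = sum(-cofactor(m_small, i, j) * rest[i] * rest[j]
               for i in range(k) for j in range(k))
    return top * det(m_small) + quad


def full_matrix(top, rest, m_small):
    """Symmetric matrix with first row (top, rest) and lower-right block m_small."""
    return [[top] + list(rest)] + [[rest[i]] + m_small[i] for i in range(len(m_small))]


m_small = [[1, -1, 1],
           [-1, -1, 1],
           [1, 1, 1]]          # a hypothetical symmetric +-1 matrix M_{n-1}

ok = all(det(full_matrix(top, rest, m_small)) == expansion(top, rest, m_small)
         for top in (-1, 1) for rest in product((-1, 1), repeat=len(m_small)))
print(ok)
```

The check runs over all 16 choices of the first row, which makes explicit that, once $M_{n-1}$ is fixed, $\det(M_n)$ is a quadratic form in the remaining entries $\xi_2, \dots, \xi_n$ plus a linear term in $\xi_1$.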



To prove Theorem 14.2, we adopt the reversed approach, which, similar to the previous
proofs, consists of an inverse statement and a counting step.

(1) (Inverse step). If $\Pr(\det(M_n) = 0 \mid M_{n-1}) \ge n^{-O(1)}$, then there is a strong additive
structure among the cofactors $a_{ij}$.

(2) (Counting step). With respect to $M_{n-1}$, a strong additive structure among the $a_{ij}$
occurs with negligible probability.

By (32), one notices that the first step concentrates on the study of the inverse Littlewood-Offord
problem for quadratic forms $\sum_{i,j} a_{ij}\xi_i\xi_j$. Roughly speaking, Theorem 11.4 implies
that most of the $a_{ij}$ belong to a common structure. Thus, by extracting the structure on
one row of the array $A = (a_{ij})$, we obtain a vector which is orthogonal to the remaining
$n - 2$ rows of the matrix $M_{n-1}$. Executing the argument more carefully, we obtain the
following lemma.

Lemma 14.4 (Inverse Step). Let $\epsilon < 1$ and C be positive constants. Assume that $M_{n-1}$
has rank at least $n - 2$ and that

$$\Pr(\det(M_n) = 0 \mid M_{n-1}) \ge n^{-C}.$$

Then there exists a nonzero vector $u = (u_1, \dots, u_{n-1})$ with the following properties.

• All but $n^{\epsilon}$ of the elements $u_i$ belong to a proper symmetric generalized arithmetic
progression of rank $O_{C,\epsilon}(1)$ and size $n^{O_{C,\epsilon}(1)}$.

• $u_i \in \{p/q : p, q \in \mathbb{Z}, |p|, |q| = n^{O_{C,\epsilon}(n^{\epsilon})}\}$ for all i.

• u is orthogonal to $n - O_{C,\epsilon}(n^{\epsilon})$ rows of $M_{n-1}$.

Let $\mathcal{P}$ denote the collection of all $u$ satisfying the properties above. For each $u \in \mathcal{P}$, let $P_u$ be the probability, with respect to $M_{n-1}$, that $u$ is orthogonal to $n - O_{C,\epsilon}(n^{\epsilon})$ rows of $M_{n-1}$. The following lemma takes care of our second step.

Lemma 14.5 (Counting Step). We have

\[ \sum_{u \in \mathcal{P}} P_u = O_{C,\epsilon}\big((1/2)^{(1-o(1))n}\big). \]



The main contribution to the sum in Lemma 14.5 comes from those $u$ which have just a few non-zero components (i.e. compressible vectors). We classify the incompressible vectors into dyadic classes $\mathcal{C}_{\rho_1,\dots,\rho_{n-1}}$, where $\rho_i$ is at most twice and at least half the probability $P(\xi_1 u_1 + \dots + \xi_i u_i = 0)$. Assume that $u \in \mathcal{C}_{\rho_1,\dots,\rho_{n-1}}$. Then by definition, as $M_{n-1}$ is symmetric, the probability $P_u$ is bounded by $\prod_i O(\rho_i)$. On the other hand, by taking into account the structure of generalized arithmetic progressions, a variant of Theorem 8.1 shows that the size of each $\mathcal{C}_{\rho_1,\dots,\rho_{n-1}}$ is bounded by $\prod_i O(\rho_i)^{-1} \cdot n^{-n/2+o(n)}$. Summing $P_u$ over all classes $\mathcal{C}$, and noticing that the number of these classes is negligible, one obtains an upper bound of order $n^{-(1-o(1))n/2}$ for the incompressible vectors.

We remark that it is in the Inverse Step that we obtain the final bound on the singular probability. In [69], Vershynin worked in a more general setting where one can assume a better bound. In this regime, he was able to apply a variant of Theorem 10.1 to prove a very mild inverse-type result which is easily adapted to the Counting Step. As the details are complex, we invite the reader to consult [69].



15. Application: Common roots of random polynomials 
Let $d$ be fixed. With $\mathbf{j}_d = (j_1, \dots, j_d)$, $j_i \in \mathbb{Z}_+$, and $|\mathbf{j}_d| = \sum_i j_i$, let $\xi_{\mathbf{j}_d}$ be iid copies of a random variable $\xi$. Set $x^{\mathbf{j}_d} = \prod_i x_i^{j_i}$. Consider the random polynomial

\[ P(x_1, \dots, x_d) = \sum_{\mathbf{j}_d,\, |\mathbf{j}_d| \le n} \xi_{\mathbf{j}_d}\, x^{\mathbf{j}_d} \]

of degree $n$ in $d$ variables. (Here $d$ is fixed and $n \to \infty$.) Random polynomials are a classical subject in analysis and probability, and we refer to [4] for a survey.

In this section, we consider the following natural question. Let $P_1, \dots, P_{d+1}$ be $d + 1$ independent random polynomials, each having $d$ variables and degree $n$.

Question 15.1. What is the probability that $P_1, \dots, P_{d+1}$ have a common root?

For short, let us denote the probability under consideration by $p(n, d)$:

\[ p(n, d) := P(\exists x \in \mathbb{C}^d : P_i(x) = 0,\ i = 1, \dots, d + 1). \]

When $\xi$ has a continuous distribution, it is obvious that $p(n, d) = 0$. However, the situation is less clear when $\xi$ has a discrete distribution, even in the case $d = 1$. Indeed, when $n$ is even and $P_1(x), P_2(x)$ are two independent random Bernoulli polynomials of one variable, one has $P(P_1(1) = P_2(1) = 0) = \Theta(1/n)$ and $P(P_1(-1) = P_2(-1) = 0) = \Theta(1/n)$. Thus in this case $p(n, 1) = \Omega(1/n)$.
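The $\Theta(1/n)$ order of these two probabilities can be checked exactly for small cases. The following is a minimal sketch, not taken from [32]: it assumes a polynomial with an even number `m` of independent uniform $\pm 1$ coefficients (so that the value at $1$ can vanish at all), and the function names are hypothetical. The same computation applies at $x = -1$, since the signed coefficients are again iid uniform signs.

```python
from fractions import Fraction
from math import comb, pi

def prob_zero_at_one(m: int) -> Fraction:
    """Exact P(xi_1 + ... + xi_m = 0) for m iid uniform +-1 signs."""
    if m % 2 == 1:
        return Fraction(0)               # an odd number of signs never sums to 0
    return Fraction(comb(m, m // 2), 2 ** m)

def prob_common_root_at_one(m: int) -> Fraction:
    """P(P1(1) = P2(1) = 0) for two independent polynomials (product by independence)."""
    return prob_zero_at_one(m) ** 2

if __name__ == "__main__":
    for m in (10, 100, 1000):
        scaled = float(m * prob_common_root_at_one(m))
        print(m, scaled, 2 / pi)         # the scaled value approaches 2/pi
```

The product of the number of coefficients and the probability stabilises near $2/\pi$, which is the $\Theta(1/n)$ behaviour used in the text.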

In a recent paper, Kozma and Zeitouni [32] proved $p(n, d) = O(1/n)$, answering Question 15.1 in the asymptotic sense.

Theorem 15.2. For any fixed $d$ there exists a constant $c(d)$ such that the following holds. Let $P_1, \dots, P_{d+1}$ be $d + 1$ independent random Bernoulli polynomials in $d$ variables and degree $n$. Then

\[ p(n, d) \le c(d)/n. \]



In the sequel, we will focus on the case d = 1. This first case already captures some of the 
main ideas, especially the use of inverse theorems. The reader is invited to consult [32] for 
further details. 



Theorem 15.3. Let $P_1, P_2$ be two independent Bernoulli random polynomials in one variable of degree $n$. Then

\[ p(n, 1) = \begin{cases} \Theta(n^{-1}), & n \text{ even}, \\ O(n^{-3/2}), & n \text{ odd}. \end{cases} \]

Notice that the bounds in both cases are sharp. To start the proof, first observe that, because the coefficients of $P_i$ are $\pm 1$, all roots $x$ of $P_i$ have magnitude $1/2 < |x| < 2$. Furthermore, $x$ must be an algebraic integer. We will try to classify the common roots by their unique irreducible polynomial, relying on the following easy algebraic fact [32]:

Fact 15.4. For every k there are only finitely many numbers whose irreducible polynomial 
has degree k that can be roots of a polynomial of arbitrary degree with coefficients ±1. 



Now we look at the event of having common roots. Assume that $P_1$ is fixed (i.e. condition on $P_1$) and let $x_1, \dots, x_n$ be its $n$ complex roots. For each $x_i$, we consider the probability that $x_i$ is a root of $P_2(x)$. If $P(P_2(x_i) = 0) \le n^{-5/2}$ for all $i$, then $P(\exists x \in \mathbb{C} : P_1(x) = P_2(x) = 0) = O(n^{-3/2})$ by the union bound over the $n$ roots, and there is nothing to prove. We now consider the case $P(P_2(x_i) = 0) \ge n^{-5/2}$ for some root $x_i$ of $P_1(x)$. Notice that

\[ P(P_2(x_i) = 0) = P_{\xi_0,\dots,\xi_n}\Big(\sum_{j=0}^{n} \xi_j x_i^j = 0\Big) \le \rho(X), \]

where $X$ is the geometric progression $X = \{1, x_i, \dots, x_i^n\}$.

Now Theorem 7.6 comes into play. As $\rho(X) \ge n^{-5/2}$, most of the terms of $X$ are additively correlated. On the other hand, as $X$ is a geometric progression, this is the case only if $x_i$ is a root of a bounded degree polynomial with well-controlled rational coefficients.

Lemma 15.5. For any $C > 0$, there exists $n_0$ such that if $n \ge n_0$ and

\[ \rho(X) \ge n^{-C}, \]

where $X = \{1, x, \dots, x^n\}$, then $x$ is an algebraic number of degree at most $2C$.



Proof (of Lemma 15.5). Set $\epsilon = 1/(2C + 2)$. Theorem 7.6, applied to the set $X$, implies that there exists a GAP $Q$ of rank $r$ and size $|Q| = O_C(n^{C - r/2})$ which contains at least a $(2C + 1)/(2C + 2)$-portion of the elements of $X$. By the pigeonhole principle, there exist $2C + 1$ consecutive terms of $X$, say $x^{i_0}, \dots, x^{i_0 + 2C}$, all of which belong to $Q$.

As $|Q| \ge 1$, the rank $r$ of $Q$ must be at most $2C$. Thus there exist integral coefficients $m_1, \dots, m_{2C+1}$, not all zero and all bounded by $n^{O_C(1)}$, such that the linear combination $\sum_{i=1}^{2C+1} m_i x^{i_0 + i - 1}$ vanishes. In particular, it follows that $x$ is an algebraic number of degree at most $2C$. □
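To get a feeling for the quantity $\rho(X)$ in Lemma 15.5, one can compute it exactly for short geometric progressions. The sketch below is an illustration under stated assumptions rather than part of the proof: the $\xi_j$ are uniform $\pm 1$ signs, integers are stored as 1-tuples, and elements $a + b\phi$ of $\mathbb{Z}[\phi]$ (with $\phi$ the golden ratio, a root of the $\pm 1$ polynomial $1 + x - x^2$) are stored as pairs $(a, b)$. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def concentration(vectors):
    """rho(V) = max_a P(sum_i xi_i v_i = a) for iid uniform +-1 signs xi_i.

    Each v_i is a tuple of integers; the distribution is built one term at a time.
    """
    dim = len(vectors[0])
    dist = Counter({(0,) * dim: 1})       # number of sign patterns reaching each sum
    for v in vectors:
        nxt = Counter()
        for total, count in dist.items():
            nxt[tuple(t + c for t, c in zip(total, v))] += count
            nxt[tuple(t - c for t, c in zip(total, v))] += count
        dist = nxt
    return Fraction(max(dist.values()), 2 ** len(vectors))

def integer_powers(x, n):
    """The progression 1, x, ..., x^n for an integer x."""
    return [(x ** j,) for j in range(n + 1)]

def golden_powers(n):
    """The progression 1, phi, ..., phi^n, each power stored as (a, b) = a + b*phi."""
    powers, cur = [], (1, 0)
    for _ in range(n + 1):
        powers.append(cur)
        a, b = cur
        cur = (b, a + b)                  # (a + b*phi)*phi = b + (a + b)*phi
    return powers

if __name__ == "__main__":
    n = 12
    for label, X in (("x = 1", integer_powers(1, n)),
                     ("x = 2", integer_powers(2, n)),
                     ("x = phi", golden_powers(n))):
        print(label, float(concentration(X)))
```

For $x = 1$ the concentration decays only polynomially in $n$, while for $x = 2$ all signed sums are distinct and $\rho(X) = 2^{-(n+1)}$.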



We now prove Theorem 15.3. Write

\begin{align*}
p(n, 1) &= P(\exists x \in \mathbb{C} : P_1(x) = P_2(x) = 0) \\
&\le P(P_1(1) = P_2(1) = 0) + P(P_1(-1) = P_2(-1) = 0) \\
&\quad + P(\exists x \text{ of algebraic degree } 2, 3, 4, 5 : P_1(x) = P_2(x) = 0) \\
&\quad + P(\exists x \text{ of algebraic degree} \ge 6 : P_1(x) = P_2(x) = 0) \\
&= S_1 + S_2 + S_3,
\end{align*}

where $S_1$ collects the two probabilities for the roots $x = \pm 1$.

For the first term, it is clear that $S_1 = \Theta(n^{-1})$ if $n$ is even, and $S_1 = 0$ otherwise. For the second term $S_2$, by Fact 15.4, the number of possible common roots $x$ of algebraic degree at most 5 is $O(1)$, so it suffices to show that $P(P_1(x) = P_2(x) = 0) = O(n^{-3/2})$ for each such $x$. By Lemma 15.5 (applied with $C = 3/4$) we must have $P(P_i(x) = 0) \le n^{-3/4}$, because $x$ cannot be a rational number (i.e. an algebraic number of degree one). Thus we have

\[ P(P_1(x) = P_2(x) = 0) = P(P_1(x) = 0)\, P(P_2(x) = 0) \le n^{-3/2}. \]

Lastly, in order to bound $S_3$ we first fix $P_1(x)$. It has at most $n$ roots $x$ of algebraic degree at least 6. For each of these roots, by Lemma 15.5 (applied with $C = 5/2$), $P(P_2(x) = 0) = O(n^{-5/2})$. Thus the probability that $P_2$ has a common root with $P_1$ which is an algebraic number of degree at least 6 is bounded by $n \times O(n^{-5/2}) = O(n^{-3/2})$.
16. Application: Littlewood-Offord type bound for multilinear forms and Boolean circuits

Let $k$ be a fixed positive integer, and let $p(\xi_1, \dots, \xi_n) = \sum_{S \subset [n],\, |S| \le k} c_S \xi_S$ be a random multi-linear polynomial of degree at most $k$, where the $\xi_i$ are iid Bernoulli variables (taking values $\{0, 1\}$ with equal probability) and $\xi_S = \prod_{i \in S} \xi_i$. As mentioned in Section 4, by generalizing the proof of Theorem 4.1, Costello, Tao and the second author proved the following.

Theorem 16.1. Let $K$ denote the number of non-zero coefficients $c_S$, and set $m := K/n^{k-1}$. Then for any real number $x$ we have

\[ P(p = x) = O\big(m^{-1/2^{(k^2+k)/2}}\big). \]

Using a finer analysis, Razborov and Viola [42] improved the exponent $1/2^{(k^2+k)/2}$ to $1/(2k2^k)$.

Theorem 16.2. Let $p(\xi_1, \dots, \xi_n) = \sum_{S \subset [n],\, |S| \le k} c_S \xi_S$ be a multi-linear polynomial of degree $k$, and assume that there exist $r$ terms $\xi_{S_1}, \dots, \xi_{S_r}$ of degree $k$ each, where the $S_i$ are mutually disjoint and $c_{S_i} \ne 0$. Then for any real number $x$ we have

\[ P(p = x) = O(r^{-b_k}), \]

where $b_k = (2k2^k)^{-1}$.



One observes that $r = \Omega(m/k)$, where $m$ was defined in Theorem 16.1. Indeed, assume that the collection $\{S_1, \dots, S_r\}$ is maximal (with respect to disjointness). Then every set $S$ with $c_S \ne 0$ either has degree less than $k$ or intersects one of the $S_i$. Thus $K = O(rkn^{k-1})$, and so $r = \Omega(m/k)$.
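The counting behind $K = O(rkn^{k-1})$ can be made concrete with a greedy construction. The sketch below is only an illustration: the greedy routine is one convenient way (an assumption, not the authors' procedure) to produce a family that is maximal with respect to disjointness, and the names are hypothetical. Every set of size $k$ in the family meets the union of the chosen sets, which has $rk$ elements, and each element lies in at most $\binom{n-1}{k-1} \le n^{k-1}$ sets of size $k$.

```python
import random
from itertools import combinations
from math import comb

def greedy_maximal_disjoint(family):
    """Greedily pick pairwise disjoint sets; no further set of the family can be added."""
    chosen, used = [], set()
    for s in family:
        if used.isdisjoint(s):
            chosen.append(s)
            used |= s
    return chosen

if __name__ == "__main__":
    rng = random.Random(0)
    n, k = 12, 3
    all_sets = [frozenset(c) for c in combinations(range(n), k)]
    family = rng.sample(all_sets, 60)     # supports S with |S| = k and c_S != 0
    chosen = greedy_maximal_disjoint(family)
    r, K = len(chosen), len(family)
    # K is at most (size of the union) * (number of k-sets through one element)
    assert K <= r * k * comb(n - 1, k - 1)
    print(r, K, r * k * comb(n - 1, k - 1))
```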

It is a very interesting question (in its own right and for applications) to improve the exponent further. In the rest of this section, we discuss Razborov and Viola's main application of Theorem 16.2.

For two functions $f, g : \{0, 1\}^n \to \mathbb{R}$, one defines their correlation as

\[ \mathrm{Cor}_n(f, g) := P(f(\xi_1, \dots, \xi_n) = g(\xi_1, \dots, \xi_n)) - 1/2, \]

where the $\xi_i$ are iid Bernoulli variables taking values $\{0, 1\}$ with equal probability.
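For small $n$ the correlation can be computed by direct enumeration. The following minimal sketch (function names are hypothetical) implements the definition literally: any point where the real-valued polynomial does not agree with the Boolean function, including any non-Boolean output, counts against the agreement probability.

```python
from fractions import Fraction
from itertools import product

def correlation(f, g, n):
    """Cor_n(f, g) = P(f(xi) = g(xi)) - 1/2 for xi uniform on {0,1}^n, by enumeration."""
    agree = sum(1 for x in product((0, 1), repeat=n) if f(x) == g(x))
    return Fraction(agree, 2 ** n) - Fraction(1, 2)

def parity(x):
    """Parity (XOR) of the bits of x."""
    return sum(x) % 2

if __name__ == "__main__":
    n = 2
    linear = lambda x: x[0]                           # degree-one polynomial
    exact = lambda x: x[0] + x[1] - 2 * x[0] * x[1]   # degree-two polynomial equal to XOR
    print(correlation(linear, parity, n))             # prints 0
    print(correlation(exact, parity, n))              # prints 1/2
```

On two variables a degree-two polynomial represents parity exactly, while the degree-one polynomial $x_1$ has correlation $0$ with it.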

Most of the research in Complexity Theory has so far concentrated on the case in which both $f$ and $g$ are Boolean functions (that is, $f(x), g(x) \in \{0, 1\}$). To incorporate arbitrary multivariate polynomials into this framework, one converts them to Boolean functions. There are two popular ways of doing this. For a polynomial $p$ with integer coefficients, define a Boolean function $b(x) = 1$ if $m \mid p(x)$, where $m$ is a given integer, and $b(x) = 0$ otherwise. These functions $b$ are called modular polynomials. For arbitrary $p$, one can set $b(x) = 1$ if $p(x) > t$ for some given threshold $t$, and $b(x) = 0$ otherwise. We refer to these functions $b$ as threshold polynomials. For further discussion on these polynomials, we refer the reader to [MIES].

It is an open problem to exhibit an explicit Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ such that $\mathrm{Cor}_n(b, f) = o(1/\sqrt{n})$ for any modular polynomial $b$ whose underlying polynomial $p$ has degree $\log_2 n$ (see [70]). The same problem is also open for threshold polynomials.

In [42], Razborov and Viola initiated a similar study for the correlation of multi-variable polynomials where any output outside of $\{0, 1\}$ is counted as an error. They highlighted the following problem.

Problem 16.3. Exhibit an explicit Boolean function $f$ such that $\mathrm{Cor}_n(p, f) = o(1/\sqrt{n})$ for any real polynomial $p : \{0, 1\}^n \to \mathbb{R}$ of degree $\log_2 n$.

It is well-known that analogies between polynomial approximations and matrix approxima- 
tions are important and influential in theory and other areas like Machine Learning (see for 
instance [51]). Viewed under this angle, Razborov and Viola's model is a straightforward 
analogy of matrix rigidity [68] that still remains one of the main unresolved problems in 
the modern Complexity Theory. For further discussion and motivation, we refer to ^] and 
the references therein. It is noted that solving Problem 16.3 is a prerequisite for solving the corresponding open problem for threshold polynomials. Similarly, the special case of Problem 16.3 when the polynomials have integer coefficients is a prerequisite for solving the corresponding open problem for modular polynomials. As a quick application of Theorem 16.2, we demonstrate here a result addressing the question for lower degree polynomials.



Theorem 16.4. [42, Theorem 1.2] We have $\mathrm{Cor}_n(p, \mathrm{parity}) \le 0$ for every sufficiently large $n$ and every real polynomial $p : \{0, 1\}^n \to \mathbb{R}$ of degree at most $\log_2 \log_2 n / 2$.



Proof (of Theorem 16.4). First we suppose that the hypothesis of Theorem 16.2 is satisfied with $r = \sqrt{n}$. Then the probability that the polynomial outputs a Boolean value is bounded by

\[ 2 \times O\big((1/\sqrt{n})^{1/(2k2^k)}\big) \le 1/2, \]

where $k \le \frac{1}{2} \log_2 \log_2 n$.

Otherwise, we can cover all the terms of degree $k$ by $k\sqrt{n}$ variables. Freeze these variables and iterate. After at most $k$ iterations, either the hypothesis of Theorem 16.2 is satisfied with $r = \sqrt{n}$ (and with smaller degree), in which case we would be done, or else we end up with a degree-one polynomial with $n - O(k^2)\sqrt{n} \ge 1$ variables, in which case the statement is true by comparison with the parity function. □
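To see why the degree restriction $k \le \frac{1}{2}\log_2\log_2 n$ makes the bound from Theorem 16.2 useful, one can evaluate the factor $(1/\sqrt{n})^{b_k}$ with $b_k = (2k2^k)^{-1}$ along $n = 2^{2^j}$. This is a rough numerical sketch only: the implied constant in $O(\cdot)$ is unknown and omitted, and the function name is hypothetical.

```python
def small_ball_factor(log2_n: float, k: int) -> float:
    """(1/sqrt(n))^(b_k) with b_k = 1/(2k 2^k), computed from log2(n) to avoid overflow."""
    b_k = 1.0 / (2 * k * 2 ** k)
    return 2.0 ** (-(log2_n / 2.0) * b_k)

if __name__ == "__main__":
    for j in (4, 6, 8, 10, 12):
        log2_n = float(2 ** j)            # n = 2^(2^j)
        k = j // 2                        # k = (1/2) log2 log2 n
        print(j, k, small_ball_factor(log2_n, k))
```

The factor decreases, slowly, as $n$ grows, which is consistent with the statement holding only for sufficiently large $n$.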



17. Application: Solving Frankl and Furedi's conjecture 



In this section, we return to the discussion in Section 2 and give a proof of Conjecture 2.4 and a new proof for Theorem 2.2. Both proofs are based on the following inverse theorem.



Theorem 17.1. For any fixed $d$ there is a constant $C$ such that the following holds. Let $A = \{a_1, \dots, a_n\}$ be a multi-set of vectors in $\mathbb{R}^d$ such that $\rho_{d,1,\mathrm{Ber}}(A) \ge Ck^{-d/2}$. Then $A$ is "almost" flat. Namely, there is a hyperplane $H$ such that $\mathrm{dist}(a_i, H) \ge 1$ for at most $k$ values of $i = 1, \dots, n$.

The proof of this theorem combines Esseen's bound (Lemma 6.2) with some geometric arguments. For details, see [66]; $\mathrm{dist}(a, H)$, of course, means the distance from $a$ to $H$.



We first prove Theorem 2.2 by induction on the dimension $d$. The case $d = 1$ follows from Theorem 2.1, so we assume that $d \ge 2$ and that the claim has already been proven for smaller values of $d$. It suffices to prove the upper bound

\[ p(d, R, \mathrm{Ber}, n) \le (1 + o(1))\, 2^{-n} S(n, s). \]

Let $\epsilon > 0$ be a small parameter to be chosen later. Suppose the claim failed; then there exists $R > 0$ such that for arbitrarily large $n$, there exist a multi-set $A = \{a_1, \dots, a_n\}$ of vectors in $\mathbb{R}^d$ of length at least 1 and a ball $B$ of radius $R$ such that

\[ P(S_A \in B) \ge (1 + \epsilon)\, 2^{-n} S(n, s). \tag{33} \]

In particular, from Stirling's approximation one has

\[ P(S_A \in B) \gg n^{-1/2}. \]

Applying the pigeonhole principle, we can find a ball $B_0$ of radius $1/\log n$ such that

\[ P(S_A \in B_0) \gg n^{-1/2} \log^{-d} n. \]

Set $k := n^{2/3}$. Since $d \ge 2$ and $n$ is large, we have

\[ P(S_A \in B_0) \ge C k^{-d/2} \]

for some fixed constant $C$. Applying Theorem 17.1 (rescaling by $\log n$), we conclude that there exists a hyperplane $H$ such that $\mathrm{dist}(a_i, H) \le 1/\log n$ for at least $n - k$ values of $i = 1, \dots, n$.

Let $A'$ denote the orthogonal projection to $H$ of the vectors $a_i$ with $\mathrm{dist}(a_i, H) \le 1/\log n$. By conditioning on the signs of all the $\xi_i$ with $\mathrm{dist}(a_i, H) > 1/\log n$, and then projecting the sum $S_A$ onto $H$, we conclude from (33) the existence of a $(d-1)$-dimensional ball $B'$ in $H$ of radius $R$ such that

\[ P(S_{A'} \in B') \ge (1 + \epsilon)\, 2^{-n} S(n, s). \]

On the other hand, the vectors in $A'$ have magnitude at least $1 - 1/\log n$. If $n$ is sufficiently large depending on $d, \epsilon$, this contradicts the induction hypothesis (after rescaling $A'$ by $1/(1 - 1/\log n)$ and identifying $H$ with $\mathbb{R}^{d-1}$ in some fashion; notice that the scaling changes $R$ slightly but does not change $s$, and also that the function $2^{-n} S(n, s)$ is decreasing in $n$). This concludes the proof of Theorem 2.2.
Now we turn to the proof of Conjecture 2.4. We can assume $s \ge 3$, as the remaining cases have already been treated (see Section 2). If the conjecture failed, then there exist arbitrarily large $n$ for which there exist a multi-set $A = \{a_1, \dots, a_n\}$ of vectors in $\mathbb{R}^d$ of length at least 1 and a ball $B$ of radius $R$ such that

\[ P(S_A \in B) > 2^{-n} S(n, s). \tag{34} \]

By iterating the argument used in the proof of Theorem 2.2, we may find a one-dimensional subspace $L$ of $\mathbb{R}^d$ such that $\mathrm{dist}(a_i, L) \ll 1/\log n$ for at least $n - O(n^{2/3})$ values of $i = 1, \dots, n$. By reordering, we may assume that $\mathrm{dist}(a_i, L) \ll 1/\log n$ for all $1 \le i \le n - k$, where $k = O(n^{2/3})$.

Let $\pi : \mathbb{R}^d \to L$ be the orthogonal projection onto $L$. We divide into two cases. The first case is when $|\pi(a_i)| > R/s$ for all $1 \le i \le n$. We then use the trivial bound

\[ P(S_A \in B) \le P(S_{\pi(A)} \in \pi(B)). \]

If we rescale Theorem 2.1 by a factor slightly less than $s/R$, we see that

\[ P(S_{\pi(A)} \in \pi(B)) \le 2^{-n} S(n, s), \]

which contradicts (34).

In the second case, we assume $|\pi(a_n)| \le R/s$. We let $A'$ be the multi-set $\{a_1, \dots, a_{n-k}\}$; then by conditioning on $\xi_{n-k+1}, \dots, \xi_{n-1}$ we conclude the existence of a ball $B'$ of radius $R$ such that

\[ P(S_{A'} + \xi_n a_n \in B') \ge P(S_A \in B). \]

Let $x_{B'}$ be the center of $B'$. Observe that if $S_{A'} + \xi_n a_n \in B'$ (for any value of $\xi_n$) then $|S_{\pi(A')} - \pi(x_{B'})| \le R + \frac{R}{s}$. Furthermore, if $|S_{\pi(A')} - \pi(x_{B'})| > \sqrt{R^2 - 1}$, then the parallelogram law shows that $S_{A'} + a_n$ and $S_{A'} - a_n$ cannot both lie in $B'$, and so conditioned on $|S_{\pi(A')} - \pi(x_{B'})| > \sqrt{R^2 - 1}$, the probability that $S_{A'} + \xi_n a_n \in B'$ is at most $1/2$.

We conclude that

\begin{align*}
P(S_{A'} + \xi_n a_n \in B') &\le P\big(|S_{\pi(A')} - \pi(x_{B'})| \le \sqrt{R^2 - 1}\big) + \frac{1}{2} P\Big(\sqrt{R^2 - 1} < |S_{\pi(A')} - \pi(x_{B'})| \le R + \frac{R}{s}\Big) \\
&= \frac{1}{2} \Big( P\big(|S_{\pi(A')} - \pi(x_{B'})| \le \sqrt{R^2 - 1}\big) + P\Big(|S_{\pi(A')} - \pi(x_{B'})| \le R + \frac{R}{s}\Big) \Big).
\end{align*}

However, note that all the elements of $\pi(A')$ have magnitude at least $1 - 1/\log n$. Assume, for a moment, that $R$ satisfies

\[ \sqrt{R^2 - 1} < s - 1 \le R < R + \frac{R}{s} < s. \tag{35} \]

From Theorem 2.1 (rescaled by $(1 - 1/\log n)^{-1}$), we conclude that

\[ P\big(|S_{\pi(A')} - \pi(x_{B'})| \le \sqrt{R^2 - 1}\big) \le 2^{-(n-k)} S(n - k, s - 1) \]

and

\[ P\Big(|S_{\pi(A')} - \pi(x_{B'})| \le R + \frac{R}{s}\Big) \le 2^{-(n-k)} S(n - k, s). \]

On the other hand, by Stirling's formula (if $n$ is sufficiently large) we have

\[ \frac{1}{2}\, 2^{-(n-k)} S(n - k, s - 1) + \frac{1}{2}\, 2^{-(n-k)} S(n - k, s) = \frac{\sqrt{2/\pi}\,(s - \frac{1}{2}) + o(1)}{n^{1/2}} \]

while

\[ 2^{-n} S(n, s) = \frac{\sqrt{2/\pi}\, s + o(1)}{n^{1/2}}, \]

and so we contradict (34).
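The two Stirling asymptotics can be checked numerically. The sketch below assumes, as in Section 2, that $S(n, s)$ denotes the sum of the $s$ largest binomial coefficients $\binom{n}{i}$; if the paper's definition differs, the code has to be adapted. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, pi, sqrt

def S(n: int, s: int) -> int:
    """Sum of the s largest binomial coefficients C(n, i) (assumed definition of S(n, s))."""
    coeffs = sorted((comb(n, i) for i in range(n + 1)), reverse=True)
    return sum(coeffs[:s])

def normalized(n: int, s: int) -> float:
    """sqrt(n) * 2^(-n) * S(n, s), which should approach s * sqrt(2/pi)."""
    return sqrt(n) * float(Fraction(S(n, s), 2 ** n))

if __name__ == "__main__":
    n, s = 2000, 3
    print(normalized(n, s), s * sqrt(2 / pi))
    # the average of the (s-1)- and s-term sums falls short of the s-term sum
    print((normalized(n, s - 1) + normalized(n, s)) / 2, (s - 0.5) * sqrt(2 / pi))
```

The gap between the factors $s$ and $s - \frac{1}{2}$ is what produces the contradiction with (34).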



An inspection of the above argument shows that all we need on $R$ are the conditions (35). To satisfy the first inequality in (35), we need $R < \sqrt{(s - 1)^2 + 1}$. Moreover, once $s - 1 \le R < \sqrt{(s - 1)^2 + 1}$, one can easily check that $R + \frac{R}{s} < s$ holds automatically for any $s \ge 3$, concluding the proof.
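The "easy check" of the last inequality can be done numerically. Since $R + R/s$ is increasing in $R$, it is enough to test the right endpoint $R = \sqrt{(s-1)^2 + 1}$; the sketch below (hypothetical function names) does this and also shows why the case $s = 2$ has to be excluded.

```python
from math import sqrt

def upper_endpoint(s: int) -> float:
    """Supremum of the admissible radii: R must stay below sqrt((s-1)^2 + 1)."""
    return sqrt((s - 1) ** 2 + 1)

def last_condition_automatic(s: int) -> bool:
    """Sufficient check that R + R/s < s for all R in [s-1, sqrt((s-1)^2 + 1)).

    R + R/s is increasing in R, so the right endpoint is the worst case.
    """
    R = upper_endpoint(s)
    return R + R / s < s

if __name__ == "__main__":
    for s in (2, 3, 4, 10):
        print(s, last_condition_automatic(s))
```

For $s = 2$ the endpoint value $\sqrt{2} \cdot \frac{3}{2}$ exceeds $2$, whereas for $s = 3$ one gets $\sqrt{5} \cdot \frac{4}{3} < 3$.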



Appendix A. Proof of Theorem 7.6

In this section, we sketch the proof of Theorem 7.6.

Embedding. The first step is to embed the problem into a finite field $\mathbb{F}_p$ for some prime $p$. In the case when the $a_i$ are integers, we simply take $p$ to be a large prime (for instance $p \ge 2^n (\sum_{i=1}^n |a_i| + 1)$ suffices).

If $A$ is a subset of a general torsion-free group $G$, we rely on the concept of Freiman isomorphism. Two sets $A, A'$ of additive groups $G, G'$ (not necessarily torsion-free) are Freiman-isomorphic of order $k$ (in generalized form) if there is a bijective map $f$ from $A$ to $A'$ such that $f(a_1) + \dots + f(a_k) = f(a'_1) + \dots + f(a'_k)$ in $G'$ if and only if $a_1 + \dots + a_k = a'_1 + \dots + a'_k$ in $G$, for any subsets $\{a_1, \dots, a_k\} \subset A$ and $\{a'_1, \dots, a'_k\} \subset A$.
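For small integer sets, the defining property of a Freiman isomorphism can be tested by brute force. The sketch below is an illustration under assumptions: the $k$ summands are allowed to repeat (the usual convention), the map is given as a Python dict, and an optional `modulus` compares image sums in $\mathbb{Z}/\text{modulus}$, mimicking the passage to $\mathbb{F}_p$. The function name and parameters are hypothetical.

```python
from itertools import combinations_with_replacement

def is_freiman_isomorphism(A, f, k, modulus=None):
    """Check whether f (a dict on A) is a Freiman isomorphism of order k onto its image.

    Sums of k elements of A (repetitions allowed) must coincide exactly when
    the sums of their images coincide (modulo `modulus` if it is given).
    """
    elements = sorted(set(A))
    if len({f[a] for a in elements}) != len(elements):
        return False                       # f must be a bijection onto its image

    def image_sum(t):
        total = sum(f[a] for a in t)
        return total % modulus if modulus else total

    tuples = list(combinations_with_replacement(elements, k))
    for u in tuples:
        for v in tuples:
            if (sum(u) == sum(v)) != (image_sum(u) == image_sum(v)):
                return False
    return True

if __name__ == "__main__":
    A = [0, 1, 2, 3]
    identity = {a: a for a in A}
    print(is_freiman_isomorphism(A, identity, 2, modulus=7))   # sums 0..6 stay distinct mod 7
    print(is_freiman_isomorphism(A, identity, 2, modulus=5))   # 0+0 and 2+3 collide mod 5
```

The second call fails because the modulus is too small, which is the reason $p$ must be taken sufficiently large depending on $A$.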

The following theorem allows us to pass from an arbitrary torsion-free group to $\mathbb{Z}$ or cyclic groups of prime order (see [67, Lemma 5.25]).

Theorem A.1. Let $A$ be a finite subset of a torsion-free additive group $G$. Then for any integer $k$ the following holds.

• There is a Freiman isomorphism $\phi : A \to \phi(A)$ of order $k$ to some finite subset $\phi(A)$ of the integers $\mathbb{Z}$;

• more generally, there is a map $\phi : A \to \phi(A)$ to some finite subset $\phi(A)$ of the integers $\mathbb{Z}$ such that

\[ a_1 + \dots + a_i = a'_1 + \dots + a'_j \iff \phi(a_1) + \dots + \phi(a_i) = \phi(a'_1) + \dots + \phi(a'_j) \]

for all $i, j \le k$.

The same is true if we replace $\mathbb{Z}$ by $\mathbb{F}_p$, if $p$ is sufficiently large depending on $A$.



Thus instead of working with a subset $A$ of a torsion-free group, it is sufficient to work with a subset of $\mathbb{F}_p$, where $p$ is large enough. From now on, we can assume that the $a_i$ are elements of $\mathbb{F}_p$ for some large prime $p$. We view elements of $\mathbb{F}_p$ as integers between $0$ and $p - 1$. We use the shorthand $\rho$ to denote $\rho(A)$. The next few steps are motivated by Halász' analysis in [21].

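Fourier Analysis. The main advantage of working in $\mathbb{F}_p$ is that one can make use of discrete Fourier analysis. Assume that

\[ \rho = \rho(A) = P(S = a) \]

for some $a \in \mathbb{F}_p$. Using the standard notation $e_p(x)$ for $\exp(2\pi\sqrt{-1}\,x/p)$, we have

\[ \rho = P(S = a) = E\, \frac{1}{p} \sum_{t \in \mathbb{F}_p} e_p(t(S - a)) = E\, \frac{1}{p} \sum_{t \in \mathbb{F}_p} e_p(tS)\, e_p(-ta). \tag{36} \]

By independence,

\[ E\, e_p(tS) = \prod_{i=1}^{n} E\, e_p(t \xi_i a_i) = \prod_{i=1}^{n} \cos\frac{2\pi a_i t}{p}. \tag{37} \]

It follows that

\[ \rho \le \frac{1}{p} \sum_{t \in \mathbb{F}_p} \prod_{i} \Big|\cos\frac{2\pi a_i t}{p}\Big| = \frac{1}{p} \sum_{t \in \mathbb{F}_p} \prod_{i} \Big|\cos\frac{\pi a_i t}{p}\Big|, \tag{38} \]

where we made the change of variable $t \to t/2$ (in $\mathbb{F}_p$) to obtain the last identity.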

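By convexity, we have that $|\sin \pi z| \ge 2\|z\|$ for any $z \in \mathbb{R}$, where $\|z\| := \|z\|_{\mathbb{R}/\mathbb{Z}}$ is the distance of $z$ to the nearest integer. Thus,

\[ \Big|\cos\frac{\pi x}{p}\Big| \le 1 - \frac{1}{2}\sin^2\frac{\pi x}{p} \le 1 - 2\Big\|\frac{x}{p}\Big\|^2 \le \exp\Big(-2\Big\|\frac{x}{p}\Big\|^2\Big), \tag{39} \]

where in the last inequality we used the fact that $1 - y \le \exp(-y)$ for any $0 \le y \le 1$. Consequently, we obtain a key inequality

\[ \rho \le \frac{1}{p} \sum_{t \in \mathbb{F}_p} \prod_{i} \Big|\cos\frac{\pi a_i t}{p}\Big| \le \frac{1}{p} \sum_{t \in \mathbb{F}_p} \exp\Big(-2\sum_{i=1}^{n} \Big\|\frac{a_i t}{p}\Big\|^2\Big). \]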
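The chain of inequalities leading to the key inequality can be verified numerically on a toy example. The sketch below assumes uniform $\pm 1$ signs $\xi_i$ and an odd prime $p$; the multi-set $A = \{1, 2, 3, 4, 5\}$ and the prime $p = 101$ are arbitrary choices made for illustration, and the function names are hypothetical.

```python
from itertools import product
from math import cos, exp, pi

def dist_to_nearest_integer(z: float) -> float:
    """The norm ||z|| on R/Z: distance from z to the nearest integer."""
    return abs(z - round(z))

def concentration_mod_p(a, p):
    """rho(A) = max_b P(xi_1 a_1 + ... + xi_n a_n = b in F_p), enumerating all signs."""
    counts = {}
    for signs in product((1, -1), repeat=len(a)):
        s = sum(e * x for e, x in zip(signs, a)) % p
        counts[s] = counts.get(s, 0) + 1
    return max(counts.values()) / 2 ** len(a)

def cosine_bound(a, p):
    """(1/p) * sum over t in F_p of prod_i |cos(pi a_i t / p)|."""
    total = 0.0
    for t in range(p):
        term = 1.0
        for x in a:
            term *= abs(cos(pi * x * t / p))
        total += term
    return total / p

def exponential_bound(a, p):
    """(1/p) * sum over t in F_p of exp(-2 sum_i ||a_i t / p||^2)."""
    total = 0.0
    for t in range(p):
        total += exp(-2 * sum(dist_to_nearest_integer(x * t / p) ** 2 for x in a))
    return total / p

if __name__ == "__main__":
    a, p = [1, 2, 3, 4, 5], 101
    rho = concentration_mod_p(a, p)
    print(rho, cosine_bound(a, p), exponential_bound(a, p))
    assert rho <= cosine_bound(a, p) + 1e-12
    assert cosine_bound(a, p) <= exponential_bound(a, p) + 1e-12
```

The three printed numbers are increasing, as predicted; the exponential bound is the quantity that the level-set argument works with.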

Large level sets. Now we consider the level sets $S_m := \{t \mid \sum_{i=1}^{n} \|a_i t/p\|^2 \le m\}$. We have

\[ n^{-C} \le \rho \le \frac{1}{p} \sum_{t \in \mathbb{F}_p} \exp\Big(-2\sum_{i=1}^{n} \Big\|\frac{a_i t}{p}\Big\|^2\Big) \le \frac{1}{p} + \frac{1}{p} \sum_{m \ge 1} \exp(-2(m - 1))\, |S_m|. \]

Since $\sum_{m \ge 1} \exp(-m) < 1$, there must be a large level set $S_m$ such that

\[ |S_m| \exp(-m + 2) \ge \rho p. \tag{41} \]

In fact, since $\rho \ge n^{-C}$, we can assume that $m = O(\log n)$.
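The existence of a large level set can also be observed on a toy example. The sketch below is an illustration under assumptions: it uses the multi-set $A = \{1, 2, 3, 4, 5\}$ with uniform $\pm 1$ signs, for which $\rho(A) = 3/32$ (the most likely value of $\pm 1 \pm 2 \pm 3 \pm 4 \pm 5$ is attained by 3 of the 32 sign patterns), and the arbitrary prime $p = 101$. The function names are hypothetical.

```python
from math import exp

def dist_to_nearest_integer(z: float) -> float:
    """The norm ||z|| on R/Z: distance from z to the nearest integer."""
    return abs(z - round(z))

def level_set(a, p, m):
    """S_m = { t in F_p : sum_i ||a_i t / p||^2 <= m }."""
    return [t for t in range(p)
            if sum(dist_to_nearest_integer(x * t / p) ** 2 for x in a) <= m]

def find_large_level_set(a, p, rho, max_m=50):
    """Smallest m >= 1 with |S_m| * exp(-m + 2) >= rho * p, or None if none is found."""
    for m in range(1, max_m + 1):
        if len(level_set(a, p, m)) * exp(-m + 2) >= rho * p:
            return m
    return None

if __name__ == "__main__":
    a, p, rho = [1, 2, 3, 4, 5], 101, 3 / 32
    m = find_large_level_set(a, p, rho)
    print(m, len(level_set(a, p, m)), rho * p)
```

In this small example already $m = 1$ works, since most $t$ have a small value of $\sum_i \|a_i t/p\|^2$.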

Double counting and the triangle inequality. By double counting we have

\[ \sum_{i=1}^{n} \sum_{t \in S_m} \Big\|\frac{a_i t}{p}\Big\|^2 = \sum_{t \in S_m} \sum_{i=1}^{n} \Big\|\frac{a_i t}{p}\Big\|^2 \le m |S_m|. \]

So, for most $a_i$,

\[ \sum_{t \in S_m} \Big\|\frac{a_i t}{p}\Big\|^2 \le \frac{C_0 m}{n} |S_m| \tag{42} \]

for some large constant $C_0$.

Set $C_0 = \epsilon^{-1}$. By averaging, the set of $a_i$ satisfying (42) has size at least $(1 - \epsilon)n$. We call this set $A'$. The set $A \setminus A'$ has size at most $\epsilon n$ and this is the exceptional set that appears in Theorem 7.6. In the rest of the proof, we are going to show that $A'$ is a dense subset of a proper GAP.

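Since $\|\cdot\|$ is a norm, by the triangle inequality, we have for any $a \in kA'$

\[ \sum_{t \in S_m} \Big\|\frac{a t}{p}\Big\|^2 \le k^2 \frac{C_0 m}{n} |S_m|. \tag{43} \]

More generally, for any $l \le k$ and $a \in lA'$

\[ \sum_{t \in S_m} \Big\|\frac{a t}{p}\Big\|^2 \le k^2 \frac{C_0 m}{n} |S_m|. \tag{44} \]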

Dual sets. Define $S_m^* := \{a \mid \sum_{t \in S_m} \|at/p\|^2 \le \frac{1}{200} |S_m|\}$ (the constant 200 is ad hoc and any sufficiently large constant would do). $S_m^*$ can be viewed as some sort of a dual set of $S_m$. In fact, one can show that, as far as cardinality is concerned, it does behave like a dual:

\[ |S_m^*| \le \frac{8p}{|S_m|}. \tag{45} \]

To see this, define $T_a := \sum_{t \in S_m} \cos\frac{2\pi a t}{p}$. Using the fact that $\cos 2\pi z \ge 1 - 100\|z\|^2$ for any $z \in \mathbb{R}$, we have, for any $a \in S_m^*$,

\[ T_a \ge \sum_{t \in S_m} \Big(1 - 100 \Big\|\frac{a t}{p}\Big\|^2\Big) \ge \frac{1}{2} |S_m|. \]

On the other hand, using the basic identity $\sum_{a \in \mathbb{F}_p} \cos\frac{2\pi a x}{p} = p\, \mathbb{I}_{x = 0}$, we have

\[ \sum_{a \in \mathbb{F}_p} T_a^2 \le 2p |S_m|. \]

(45) follows from the last two estimates and averaging.



Set $k := c_1 \sqrt{n/m}$ for a properly chosen constant $c_1 = c_1(C_0)$. By (44) we have $\bigcup_{l=1}^{k} lA' \subset S_m^*$. Set $A'' = A' \cup \{0\}$; we have $kA'' \subset S_m^* \cup \{0\}$. This results in the critical bound

\[ |kA''| = O\Big(\frac{p}{|S_m|}\Big) = O(\rho^{-1} \exp(-m + 2)). \tag{46} \]

The role of $\mathbb{F}_p$ is now no longer important, so we can view the $a_i$ as integers. Notice that (46) leads us to a situation similar to that of Freiman's inverse result (Theorem 7.3). In that theorem, we have a bound on $|2A|$ and conclude that $A$ has a strong additive structure. In the current situation, 2 is replaced by $k$, which can depend on $|A|$. We can, however, finish the job by applying the following variant of Freiman's inverse theorem.

Theorem A.2 (Long range inverse theorem, [39]). Let $\gamma > 0$ be a constant. Assume that $X$ is a subset of a torsion-free group such that $0 \in X$ and $|kX| \le k^{\gamma} |X|$ for some integer $k \ge 2$ that may depend on $|X|$. Then there is a proper symmetric GAP $Q$ of rank $r = O(\gamma)$ and cardinality $O_{\gamma}(k^{-r} |kX|)$ such that $X \subset Q$.

One can prove Theorem A.2 by combining Freiman's theorem with some extra combinatorial ideas and several facts about GAPs. For full details we refer to [39].



The proof of the continuous version, Theorem 9.2, is similar. Given a real number $w$ and a variable $\xi$, we define the $\xi$-norm of $w$ by $\|w\|_{\xi} := (E \|w(\xi_1 - \xi_2)\|^2)^{1/2}$, where $\xi_1, \xi_2$ are two iid copies of $\xi$. We have the following variant of Lemma 6.2:

\[ \rho_{r,\xi}(A) \le \exp(\pi r^2) \int_{\mathbb{R}^d} \exp\Big(-\sum_{i=1}^{n} \|\langle a_i, z \rangle\|_{\xi}^2 / 2 - \pi \|z\|_2^2\Big)\, dz. \tag{47} \]

This will play the role of (38) in the previous proof. The next steps are similar and we refer the reader to [39] for more details.
We provide here a proof from [16] (see also [E]). This proof is also influenced by Halász' analysis from [21]. The starting point is again Esseen's bound. Applying Lemma 6.2, we obtain

\[ \rho_{d,\beta\sqrt{d},\xi}(A) \le C^d \int_{B(0,\sqrt{d})} \prod_{i=1}^{n} |\phi(\langle \theta, a_i \rangle / \beta)|\, d\theta, \tag{48} \]

where $\phi$ is the characteristic function of $\xi$.

Let $\xi'$ be an independent copy of $\xi$ and denote by $\bar{\xi}$ the symmetric random variable $\xi - \xi'$. Then we easily have $|\phi(t)| \le \exp(-\frac{1}{2}(1 - E\cos(2\pi t \bar{\xi})))$.

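Conditioning on $\xi'$, the assumption $\sup_a P(\xi \in B(a, 1)) \le 1 - b$ implies that $P(|\bar{\xi}| \ge 1) \ge b$. Thus,

\begin{align*}
1 - E\cos(2\pi t \bar{\xi}) &\ge P(|\bar{\xi}| \ge 1) \cdot E\big(1 - \cos(2\pi t \bar{\xi}) \mid |\bar{\xi}| \ge 1\big) \\
&\ge b \cdot \frac{4}{\pi^2}\, E\Big(\min_{q \in \mathbb{Z}} |2\pi t \bar{\xi} - 2\pi q|^2 \;\Big|\; |\bar{\xi}| \ge 1\Big) \\
&= 16b \cdot E\Big(\min_{q \in \mathbb{Z}} |t \bar{\xi} - q|^2 \;\Big|\; |\bar{\xi}| \ge 1\Big).
\end{align*}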



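Substituting this into (48) and using Jensen's inequality, we get

\begin{align*}
\rho_{d,\beta\sqrt{d},\xi}(A) &\le C^d \int_{B(0,\sqrt{d})} \exp\Big(-8b\, E\Big(\sum_{i=1}^{n} \min_{q \in \mathbb{Z}} |\bar{\xi} \langle \theta, a_i \rangle / \beta - q|^2 \;\Big|\; |\bar{\xi}| \ge 1\Big)\Big)\, d\theta \\
&\le C^d\, E\Big(\int_{B(0,\sqrt{d})} \exp\Big(-8b \min_{p \in \mathbb{Z}^n} \Big\|\frac{\bar{\xi}}{\beta}\, \theta \cdot a - p\Big\|_2^2\Big)\, d\theta \;\Big|\; |\bar{\xi}| \ge 1\Big) \\
&\le C^d \sup_{z \ge 1} \int_{B(0,\sqrt{d})} \exp(-8b f^2(\theta))\, d\theta,
\end{align*}

where $f(\theta) = \min_{p \in \mathbb{Z}^n} \big\|\frac{z}{\beta}\, \theta \cdot a - p\big\|_2$ and $\theta \cdot a$ denotes the vector $(\langle \theta, a_1 \rangle, \dots, \langle \theta, a_n \rangle)$.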

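The crucial step is to bound the size of the recurrence set

\[ I(t) := \{\theta \in B(0, \sqrt{d}) : f(\theta) \le t\}. \]

Lemma B.1. We have

\[ \mu(I(t)) \le \Big(\frac{C t \beta}{\gamma \sqrt{d}}\Big)^d, \qquad 0 \le t < \alpha/2. \]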
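Proof (of Lemma B.1). Fix $t < \alpha/2$. Consider two points $\theta', \theta'' \in I(t)$. There exist $p', p'' \in \mathbb{Z}^n$ such that

\[ \Big\|\frac{z}{\beta}\, \theta' \cdot a - p'\Big\|_2 \le t, \qquad \Big\|\frac{z}{\beta}\, \theta'' \cdot a - p''\Big\|_2 \le t. \]

Let

\[ \tau := \frac{z}{\beta}(\theta' - \theta''), \qquad p := p' - p''. \]

Then, by the triangle inequality,

\[ \|\tau \cdot a - p\|_2 \le 2t. \tag{49} \]

Recall that by the assumption of the theorem, $\mathrm{LCD}_{\alpha,\gamma}(a) \ge \frac{\sqrt{d}}{\beta}$. Thus, by the definition of the least common denominator, either $\|\tau\|_2 \ge \frac{\sqrt{d}}{\beta}$ or

\[ \|\tau \cdot a - p\|_2 \ge \min(\gamma \|\tau \cdot a\|_2, \alpha). \tag{50} \]

In the latter case, since $2t < \alpha$, (49) and (50) imply

\[ 2t \ge \gamma \|\tau \cdot a\|_2 \ge \gamma \|\tau\|_2, \]

where the last inequality follows from (14).

Thus we have proved that every pair of points $\theta', \theta'' \in I(t)$ satisfies:

\[ \text{either } \|\theta' - \theta''\|_2 \ge \frac{\sqrt{d}}{z} =: R \quad \text{or} \quad \|\theta' - \theta''\|_2 \le \frac{2t\beta}{\gamma z} =: r. \]

It follows that $I(t)$ can be covered by Euclidean balls of radii $r$, whose centers are $R$-separated in the Euclidean distance. Since $I(t) \subset B(0, \sqrt{d})$, the number of such balls is at most

\[ \frac{\mu(B(0, \sqrt{d} + R/2))}{\mu(B(0, R/2))} = \Big(\frac{2\sqrt{d}}{R} + 1\Big)^d \le \Big(\frac{3\sqrt{d}}{R}\Big)^d. \]

Summing these volumes, we obtain $\mu(I(t)) \le \big(\frac{C t \beta}{\gamma \sqrt{d}}\big)^d$. □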
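Proof (of Theorem 10.2). First, by the definition of $I(t)$ and as $\mu(B(0, \sqrt{d})) \le C^d$, we have

\[ \int_{B(0,\sqrt{d}) \setminus I(\alpha/2)} \exp(-8b f^2(\theta))\, d\theta \le \int_{B(0,\sqrt{d})} \exp(-2b\alpha^2)\, d\theta \le C^d \exp(-2b\alpha^2). \tag{51} \]

Second, by using Lemma B.1, we have

\begin{align*}
\int_{I(\alpha/2)} \exp(-8b f^2(\theta))\, d\theta &= \int_0^{\alpha/2} 16bt \exp(-8bt^2)\, \mu(I(t))\, dt \\
&\le 16b \Big(\frac{C\beta}{\gamma\sqrt{d}}\Big)^d \int_0^{\infty} t^{d+1} \exp(-8bt^2)\, dt \\
&\le \Big(\frac{C\beta}{\gamma\sqrt{b}}\Big)^d \sqrt{d} \le \Big(\frac{C'\beta}{\gamma\sqrt{b}}\Big)^d. \tag{52}
\end{align*}

Combining (51) and (52) completes the proof of Theorem 10.2. □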